
arXiv Computer Science manuscript No. (will be inserted by the editor)

Swarm Debugging: the Collective Intelligence on Interactive Debugging

Fabio Petrillo1 · Yann-Gaël Guéhéneuc3 · Marcelo Pimenta2 · Carla Dal Sasso Freitas2 · Foutse Khomh4

Received: date / Accepted: date

Abstract One of the most important tasks in software maintenance is debugging. To start an interactive debugging session, developers usually set breakpoints in an integrated development environment and navigate through different paths in their debuggers. We started our work by asking what debugging information is useful to share among developers and study two pieces of information: breakpoints (and their locations) and sessions (debugging paths). To answer our question, we introduce the Swarm Debugging concept to frame the sharing of debugging information, the Swarm Debugging Infrastructure (SDI), with which practitioners and researchers can collect and share data about developers' interactive debugging sessions, and the Swarm Debugging Global View (GV) to display debugging paths. Using the SDI, we conducted a large study with professional developers to understand how developers set breakpoints. Using the GV, we also analyzed professional developers in two studies and collected data about their debugging sessions. Our observations and the answers to our research questions suggest that sharing and visualizing debugging data can support debugging activities.

Keywords Debugging · debugging effort · software visualization · empirical studies · distributed systems · information foraging

1 Introduction

Debug. To detect, locate, and correct faults in a computer program. Techniques include the use of breakpoints, desk checking, dumps, inspection, reversible execution, single-step operations, and traces. – IEEE Standard Glossary of SE Terminology, 1990

1 Université du Québec à Chicoutimi, 2 Federal University of Rio Grande do Sul, 3 Concordia University, 4 Polytechnique Montréal, Canada


Debugging is a common activity during software development, maintenance, and evolution [1]. Developers use debugging tools to detect, locate, and correct faults. Debugging tools can be interactive or automated.

Interactive debugging tools, a.k.a. debuggers, such as sdb [2], dbx [3], or gdb [4], have been used by developers for decades. Modern debuggers are often integrated into interactive environments, e.g., DDD [5] or the debuggers of Eclipse, NetBeans, IntelliJ IDEA, and Visual Studio. They allow developers to navigate through the code, look for locations to place breakpoints, and step over/into statements. While stepping, debuggers can traverse method invocations and allow developers to toggle one or more breakpoints and stop/restart executions. Thus, they allow developers to gain knowledge about programs and the causes of faults to fix them.

Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other hybrid tools, such as slicing and query languages, may help developers, but there is insufficient evidence that they help during debugging.

Although Integrated Development Environments (IDEs) encourage developers to work collaboratively, exchanging code through Git or assessing code quality with SonarQube, one activity remains solitary: debugging. Debugging is still an individual activity during which a developer explores the source code of the system under development or maintenance using the debugger provided by an IDE. She steps into hundreds of statements and painstakingly traverses dozens of method invocations to gain an understanding of the system. Moreover, within modern interactive debugging tools, such as those included in Eclipse or IntelliJ, a debugging session cannot start if the developer does not set a breakpoint. Consequently, it is mandatory to set at least one breakpoint to launch an interactive debugging session.

Several studies have shown that developers spend over two-thirds of their time investigating code, and one-third of this time is spent in debugging [8,9,10]. However, developers do not reuse the knowledge accumulated during debugging directly. When debugging is over, they lose track of the paths that they followed into the code and of the breakpoints that they toggled. Moreover, they cannot share this knowledge with other developers easily. If a fault re-appears in the system, or if a new fault similar to a previous one is logged, the developer must restart the exploration from the beginning.

In fact, debugging tools have not changed substantially in the last 30 years: developers' primary tools for debugging their programs are still breakpoint debuggers and print statements. Indeed, changing the way developers debug their programs is one of the main motivations of our work. We are convinced that a collaborative way of using contextual information from (previous) debugging sessions to support (future) debugging activities is a very interesting approach.

Roßler [7] advocated for the development of a new family of debugging tools that use contextual information. To build context-aware debugging tools, researchers need an understanding of developers' debugging sessions to use


this information as context for their debugging. Thus, researchers need tools to collect and share data about developers' debugging sessions.

Maalej et al. [11] observed that capturing contextual information requires the instrumentation of the IDE and continuous observation of the developers' activities within the IDE. Studies by Storey et al. [12] showed that the newer generation of developers, who are proficient in social media, are comfortable with sharing such information. Developers are nowadays open, transparent, eager to share their knowledge, and generally willing to allow information about their activities to be collected by the IDEs automatically [12].

Considering this context, we introduce the concept of Swarm Debugging (SD) to (1) capture debugging contextual information, (2) share it, and (3) reuse it across debugging sessions and developers. We build the concept of Swarm Debugging on the idea that many developers performing debugging sessions independently are, in fact, building collective knowledge, which can be shared and reused with adequate support. Thus, we are convinced that developers need support to collect, store, and share this knowledge, i.e., information from and about their debugging sessions, including but not limited to breakpoint locations, visited statements, and traversed paths. To provide such support, Swarm Debugging includes (i) the Swarm Debugging Infrastructure (SDI), with which practitioners and researchers can collect and share data about developers' interactive debugging sessions, and (ii) the Swarm Debugging Global View (GV) to display debugging paths.

As a consequence of adopting SD, an interesting question emerges: what debugging information is useful to share among developers to ease debugging? Debugging provides a lot of information that could possibly be considered useful to improve software comprehension, but we are particularly interested in two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are essential for the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

In general, developers initiate an interactive debugging session by setting a breakpoint. Setting a breakpoint is one of the most frequently used features of IDEs [13]. To decide where to set a breakpoint, developers use their observations, recall their experiences with similar debugging tasks, and formulate hypotheses about their tasks [14]. Tiarks and Röhm [15] observed that developers have difficulties in finding locations for setting the breakpoints, suggesting that this is a demanding activity and that supporting developers in setting appropriate breakpoints could reduce debugging effort.

We conducted two sets of studies with the aim of understanding how developers set breakpoints and navigate (step) during debugging sessions. In observational studies, we collected and analyzed more than 10 hours of developers' videos from 45 debugging sessions, performed by 28 different, independent developers and containing 307 breakpoints, on three software systems. These observational studies helped us understand how developers use breakpoints (RQ1 to RQ4).


We also conducted two studies with 30 professional developers, a qualitative evaluation and a controlled experiment, to assess whether debugging sessions shared through our Global View visualisation support developers in their debugging tasks and are useful for sharing debugging tasks among developers (RQ5 and RQ6). We collected participants' answers in electronic forms and more than 3 hours of debugging sessions on video.

This paper has the following contributions:

– We introduce a novel approach for debugging, named Swarm Debugging (SD), based on the concepts of Swarm Intelligence and Information Foraging Theory.

– We present an infrastructure, the Swarm Debugging Infrastructure (SDI), to gather, store, and share data about interactive debugging activities to support SD.

– We provide evidence about the relation between tasks' elapsed time, developers' expertise, breakpoint setting, and debugging patterns.

– We present a new visualisation technique, Global View (GV), built on debugging sessions shared by developers to ease debugging.

– We provide evidence about the usefulness of sharing debugging sessions to ease developers' debugging.

This paper extends our previous works [16,17,18] as follows. First, we summarize the main characteristics of the Swarm Debugging approach, providing a theoretical foundation to Swarm Debugging using Swarm Intelligence and Information Foraging Theory. Second, we present the Swarm Debugging Infrastructure (SDI). Third, we perform an experiment on the debugging behavior of 30 professional developers to evaluate whether sharing debugging sessions adequately supports their debugging tasks.

The remainder of this article is organized as follows. Section 2 provides some fundamentals of debugging and the foundations of SD: the concepts of swarm intelligence and information foraging theory. Section 3 describes our approach, and Section 4 presents its implementation, the Swarm Debugging Infrastructure. Section 5 reports two studies that were conducted using the SDI to understand developers' debugging habits, and Section 6 presents an experiment to assess the benefits that our SD approach can bring to developers. Next, Section 7 discusses implications of our results, while Section 8 presents threats to the validity of our study. Section 9 summarizes related work, and, finally, Section 10 concludes the paper and outlines future work.

2 Background

This section provides background information about the debugging activity and setting breakpoints. In the following, we use failures to denote unintended behaviours of a program, i.e., when the program does something that it should not, and faults to denote the incorrect statements in source code causing failures. The purpose of debugging is to locate and correct faults, hence to fix failures.


2.1 Debugging and Interactive Debugging

The IEEE Standard Glossary of Software Engineering Terminology (see the definition at the beginning of Section 1) defines debugging as the act of detecting, locating, and correcting bugs in a computer program. Debugging techniques include the use of breakpoints, desk checking, dumps, inspection, reversible execution, single-step operations, and traces.

Araki et al. [19] describe debugging as a process in which developers make hypotheses about the root cause of a problem or defect and verify these hypotheses by examining different parts of the source code of the program.

Interactive debugging consists of using a tool, i.e., a debugger, to detect, locate, and correct a fault in a program. It is a process also known as program animation, stepping, or following execution [20]. Developers often refer to this process simply as debugging because several IDEs provide debuggers to support debugging. However, it must be noted that, while debugging is the process of finding faults, interactive debugging is one particular debugging approach in which developers use interactive tools. Expressions such as interactive debugging, stepping, and debugging are used interchangeably, and there is not yet a consensus on what is the best name for this process.

2.2 Breakpoints and Supporting Mechanisms

Generally, breakpoints allow intentionally pausing the execution of a program for debugging purposes: a means of acquiring knowledge about a program during its execution, for example, by examining the call stack and variable values when the control flow reaches the locations of the breakpoints. Thus, a breakpoint indicates the location (line) in the source code of a program where a pause occurs during its execution.

Depending on the programming language, its run-time environment (in particular, the capabilities of its virtual machines, if any), and the debuggers, different types of breakpoints may be available to developers. These types include static breakpoints [21], which pause the execution of a program unconditionally, and dynamic breakpoints [22], which pause depending on some conditions, threads, or numbers of hits.

Other types of breakpoints include watchpoints, which pause the execution when a variable being watched is read and/or written. IDEs offer the means to specify the different types of breakpoints, depending on the programming languages and their run-time environments. Fig. 1-A and 1-B show examples of static and dynamic breakpoints in Eclipse. In the rest of this paper, we focus on static breakpoints because they are the most used of all types [14].
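For illustration, the following hypothetical Java fragment (not taken from our subject systems) marks, in comments, where each type of breakpoint could be set and how it would behave:

public class AccountLedger {

    private double balance;     // a watchpoint on this field pauses the execution
                                // whenever balance is read and/or written

    public void apply(double[] amounts) {
        for (double amount : amounts) {
            balance += amount;  // a static breakpoint on this line pauses on every
                                // iteration; a dynamic (conditional) breakpoint with,
                                // e.g., the condition "amount < 0" or a hit count of
                                // 100 pauses only when that condition is satisfied
        }
    }
}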

There are different mechanisms for setting a breakpoint within the code:


Fig. 1 Setting a static breakpoint (A) and a conditional breakpoint (B) using the Eclipse IDE

– GUI: most IDEs or browsers offer a visual way of adding a breakpoint, usually by clicking at the beginning of the line on which to set the breakpoint, e.g., Chrome1, Visual Studio2, IntelliJ3, and Xcode4.

– Command line: some programming languages offer debugging tools on the command line, so an IDE is not necessary to debug the code, e.g., JDB5, PDB6, and GDB7.

– Code: some programming languages allow using syntactical elements to set breakpoints, as if they were 'annotations' in the code. This approach often only supports the setting of a breakpoint, and it is necessary to use it in conjunction with the command line or the GUI. Some examples are the Ruby debugger8, Firefox9, and Chrome10.

There is a set of features in a debugger that allows developers to control the flow of the execution within the breakpoints, i.e., call-stack features, which enable continuing or stepping.

A developer can opt for continuing, in which case the debugger resumes execution until the next breakpoint is reached or the program exits. Conversely, stepping allows the developer to run, step by step, the entire program

1 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints
2 https://msdn.microsoft.com/en-us/library/5557y8b4.aspx
3 https://www.jetbrains.com/help/idea/2016.3/debugger-basics.html
4 http://jeffreysambells.com/2014/01/14/using-breakpoints-in-xcode
5 http://docs.oracle.com/javase/7/docs/technotes/tools/windows/jdb.html
6 https://docs.python.org/2/library/pdb.html
7 ftp://ftp.gnu.org/old-gnu/Manuals/gdb-5.1.1/html_node/gdb_37.html
8 https://github.com/cldwalker/debugger
9 https://developer.mozilla.org/pt-BR/docs/Web/JavaScript/Reference/Statements/debugger
10 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints


flow. The definition of a step varies across programming languages and debuggers, but it generally includes invoking a method and executing a statement. While stepping, a developer can navigate between steps using the following commands, illustrated in the sketch after this list:

– Step Over: the debugger steps over a given line. If the line contains a function, the function is executed and the result returned without stepping through each of its lines.

– Step Into: the debugger enters the function at the current line and continues stepping from there, line by line.

– Step Out: this action takes the debugger back to the line where the current function was called.
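A minimal Java sketch of these commands (the class and method names are hypothetical; the comments describe what each command would do at the marked points):

public class SteppingDemo {

    static int square(int x) {
        return x * x;           // Step Out here resumes and suspends back at the caller
    }

    public static void main(String[] args) {
        int a = 2;              // suspended here: Step Over moves to the next line
        int b = square(a);      // Step Into enters square(); Step Over executes the
                                // whole call and suspends at the following line
        System.out.println(b);  // continuing resumes until the next breakpoint, if any
    }
}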

To start an interactive debugging session, developers set a breakpoint. If they do not, the IDE does not stop and enter its interactive mode. For example, the Eclipse IDE automatically opens the "Debugging Perspective" when execution hits a breakpoint. A developer can run a system in debugging mode without setting breakpoints, but she must set a breakpoint to be able to stop the execution, step in, and observe variable states. Briefly, there is no interactive debugging session without at least one breakpoint set in the code.

Finally, some debuggers allow debugging remotely, for example, to perform hot-fixes or to test mobile applications and systems operating in remote configurations.

2.3 Self-organization and Swarm Intelligence

Self-organization is a concept that emerged from the Social Sciences and Biology; it is defined as the set of dynamic mechanisms enabling structures to appear at the global level of a system from interactions among its lower-level components, without being explicitly coded at the lower levels. Swarm intelligence (SI) describes the behavior resulting from the self-organization of social agents, such as insects [23]. Ant nests and the societies that they house are examples of SI [24]. Individual ants can only perform relatively simple activities, yet the whole colony can collectively accomplish sophisticated activities. Ants achieve SI by exchanging information encoded as chemical signals (pheromones), e.g., indicating a path to follow or an obstacle to avoid.

Similarly, SI could be used as a metaphor to understand or explain the development of multiversion, large, and complex software systems built by software teams. Individual developers can usually perform activities without having a global understanding of the whole system [25]. In a bird's-eye view, software development is analogous to some SI, in which groups of agents, interacting locally with one another and with their environment and following simple rules, lead to the emergence of global behaviors previously unknown/impossible to the individual agents. We claim that the similarities between the SI of ant nests and complex software systems are not a coincidence: Cockburn [26] suggested that the best architectures, requirements, and designs


emerge from self-organizing developers, growing in steps and following their changing knowledge and the changing wishes of the user community, i.e., a typical example of swarm intelligence.

Fig. 2 Overview of the Swarm Debugging approach

2.4 Information Foraging

Information Foraging Theory (IFT) is based on the optimal foraging theory developed by Pirolli and Card [27] to understand how people search for information. IFT is rooted in biology studies and theories of how animals hunt for food. It was extended to debugging by Lawrance et al. [27].

However, no previous work proposes the sharing of knowledge related to debugging activities. Differently from works that use IFT with a one-prey/one-predator model [28], we are interested in many developers working independently in many debugging sessions and sharing information to allow SI to emerge. Thus, debugging becomes a foraging process in a SI environment.

These concepts, SI and IFT, have led to the design of a crowd approach applied to debugging activities: a different, collective way of doing debugging that collects, shares, and retrieves information from (previous and current) debugging sessions to support (current and future) debugging sessions.

3 The Swarm Debugging Approach

Swarm Debugging (SD) applies swarm intelligence to interactive debugging data to create knowledge for supporting software development activities. Swarm Debugging works as follows.


First, several developers perform their individual, independent debugging activities. During these activities, debugging events are collected by listeners (Label A in Figure 2), for example, breakpoint-toggling and stepping events (Label B in Figure 2), which are then stored in a debugging-knowledge repository (Label C in Figure 2). For accessing this repository, services are defined and implemented in the SDI. For example, stored events are processed by dedicated algorithms (Label D in Figure 2) (1) to create (several types of) visualizations, (2) to offer (distinct ways of) searching, and (3) to provide recommendations to assist developers during debugging. Recommendations are related to the locations where to toggle breakpoints. Storing and using these events allows sharing developers' knowledge among developers, creating a collective intelligence about the software systems and their debugging.
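To illustrate how such listeners can be attached to an IDE, the following minimal sketch registers a breakpoint listener through the Eclipse debug core API. It is only a sketch: SwarmRepository.record is a hypothetical placeholder for the SDI services (a companion sketch appears in Section 4), and the actual Swarm Debugging Tracer implementation may differ:

import org.eclipse.core.resources.IMarker;
import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class BreakpointCollector implements IBreakpointListener {

    // Register this listener with the Eclipse debug platform (Label A in Figure 2).
    public static void install() {
        DebugPlugin.getDefault().getBreakpointManager()
                   .addBreakpointListener(new BreakpointCollector());
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        IMarker marker = breakpoint.getMarker();
        // SwarmRepository.record is a hypothetical stand-in for the SDI services
        // that store events in the debugging-knowledge repository (Label C).
        SwarmRepository.record("breakpoint-added",
                marker.getResource().getName(),
                marker.getAttribute(IMarker.LINE_NUMBER, -1));
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) {
        SwarmRepository.record("breakpoint-removed",
                breakpoint.getMarker().getResource().getName(), -1);
    }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) {
        // Changes (e.g., a new condition on a breakpoint) could be recorded similarly.
    }
}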

We chose to instrument the Eclipse IDE, a popular IDE, to implement Swarm Debugging and to reach a large number of users. Also, we use services in the cloud to collect the debugging events, to process these events, and to provide visualizations and recommendations from these events. Thus, we decoupled data collection from data usage, allowing other researchers and tool vendors to use the collected data.

During debugging, developers analyze the code, toggling breakpoints and stepping in and through statements. While traditional dynamic-analysis approaches collect all interactions, states, or events, SD collects only the invocations explicitly explored by developers. The SDI collects only visited areas and paths (chains of invocations triggered by, e.g., Step Into or F5 in the Eclipse IDE) and thus does not suffer from the performance or memory issues that omniscient debuggers [29] or tracing-based approaches could.

Our decision to record information about breakpoints and stepping is well supported by a study by Beller et al. [30]. A finding of this study is that setting breakpoints and stepping through code are the most used debugging features. They showed that most of the recorded debugging events are related to the creation (45.44%), removal (43.62%), or adjustment of breakpoints, hitting them during debugging, and stepping through the source code. Furthermore, other advanced debugging features, like defining watches and modifying variable values, have been used much less [30].

4 SDI in a Nutshell

To evaluate the Swarm Debugging approach, we have implemented the Swarm Debugging Infrastructure (see https://github.com/SwarmDebugging). The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools for collecting, storing, sharing, retrieving, and visualizing data collected during developers' debugging activities. The SDI is an Eclipse IDE11 plug-in integrated with the Eclipse Debug core. It is organized in three main modules: (1) the Swarm Debugging Services, (2) the Swarm Debugging Tracer, and (3) the Swarm

11 https://www.eclipse.org


Fig. 3 GV elements: types (nodes), invocations (edges), and the task filter area

Debugging Views. All the implementation details of the SDI are available in the Appendix.
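As an illustration of the decoupling between data collection and data usage, the sketch below completes the hypothetical SwarmRepository used in Section 3: it posts a collected event to a remote service as JSON. The endpoint URL and payload fields are assumptions for illustration, not the actual SDI protocol:

import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public final class SwarmRepository {

    // Hypothetical endpoint; the real SDI services define their own API.
    private static final String EVENTS_URL = "http://server.swarmdebugging.org/events";

    public static void record(String kind, String file, int line) {
        String json = String.format(
                "{\"kind\":\"%s\",\"file\":\"%s\",\"line\":%d}", kind, file, line);
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(EVENTS_URL).openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "application/json");
            conn.setDoOutput(true);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(json.getBytes(StandardCharsets.UTF_8));
            }
            conn.getResponseCode(); // send the request; the response is ignored here
        } catch (IOException e) {
            // In this sketch, events that cannot be sent are simply dropped.
        }
    }
}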

4.1 Swarm Debugging Global View

The Swarm Debugging Global View (GV) is a call graph for modeling software, based on a directed call graph [31], that makes explicit the hierarchical relationships created by method invocations. This visualization uses rounded gray boxes (Figure 3-A) to represent types or classes (nodes) and oriented arrows (Figure 3-B) to express invocations (edges). The GV is built using the debugging-session context data previously collected by developers for different tasks.

GV was implemented using Cytoscape.js [32], a graph API JavaScript framework, applying the automatic layout manager breadthfirst. As a web application, the SD visualisations can be integrated into an Eclipse view as an SWT Browser widget or accessed through a traditional browser, such as Mozilla Firefox or Google Chrome.

In this view, the grey boxes are types that developers visited during debugging sessions. The edges represent method calls (Step Into or F5 in Eclipse) performed by all developers in all traced tasks on a software project. Each edge colour represents a task, and line thickness is proportional to the number of invocations. Each debugging session contributes a context, and the visualisation combines all collected invocations. The visualisation is organised in layers or stacks, and each line is a layer of invocations. The starting points (non-invoked methods) are allocated on top of a tree, the adjacent


Fig. 4 GV on all tasks

nodes in an invocation sequence. Besides, developers can go directly to a type in the Eclipse editor by double-clicking the corresponding node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and zoom in/out on the visualisation. Figure 4 shows an example of the GV with all tasks for the JabRef system, for which we have data about 8 tasks.
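To illustrate how such a view can be assembled from collected data, the following sketch aggregates stepping events into task-labelled, weighted call-graph edges (line thickness in the GV being proportional to the invocation count). The StepEvent shape and its fields are hypothetical, not the SDI's actual data model:

import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CallGraphBuilder {

    // Hypothetical shape of a collected stepping event.
    static class StepEvent {
        String task;        // e.g., "0667"
        String callerType;  // type where the Step Into happened
        String calleeType;  // type that was entered
    }

    // Key: task + caller -> callee; value: number of invocations (edge weight).
    public static Map<String, Integer> buildEdges(List<StepEvent> events) {
        Map<String, Integer> edges = new HashMap<>();
        for (StepEvent e : events) {
            String key = e.task + ":" + e.callerType + "->" + e.calleeType;
            edges.merge(key, 1, Integer::sum);
        }
        return edges;
    }
}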

GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including the type declarations and method invocations explored by developers based on their decisions.

5 Using SDI to Understand Debugging Activities

The first benefit of the SDI is that it allows collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviors during debugging activities. To illustrate this point, we conducted two experiments using the SDI to understand developers' debugging habits: the times at which and effort with which they set breakpoints, and the locations where they set breakpoints.

Our analysis builds upon three independent sets of observations involving, in total, three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions available from our own collected videos (Study 1) and from an empirical study performed by Jiang et al. [33] (Study 2).

In these studies, we answered the following research questions:

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?


RQ3: Are there consistent, common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent, common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef12, version 3.2, as the subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion and low coupling, thus reducing the potential commingling effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service13. Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montréal to participate in our study. We surveyed all the participants' background information before the study14. The survey included questions about the participants' self-assessed levels of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD and interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the debugger of Eclipse.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose fault-fixing tasks that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root causes of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available at https://goo.gl/forms/dxCQaBke2l2cqjB42


Table 1 Summary of the issues considered in JabRef in Study 1

Issues   Summaries
318      "Normalize to Bibtex name format"
667      "hash/pound sign causes URL link to fail"
669      "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993      "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026     "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating, step by step, how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17], which automatically collects breakpoint-related events. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), open-source software for live streaming and recording. We used the OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked the participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who had too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 http://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that each task would require about 20 minutes per fault, which we discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and for a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language (an illustrative snippet appears at the end of this subsection):
  – call: method/function invocations
  – return: returns of values
  – assignment: settings of values
  – if-statement: conditional statements
  – while-loop: loops, iterations

– Summaries of the results of the study, one per participant, via a questionnaire that included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started counting when the participant effectively started a task, i.e., for example, when she started the Swarm Debugging Tracer plug-in.

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.

– End Time (T): the time when the participant finished a task.


– Elapsed End Time (ET): ET = T − ST
– Elapsed Time of First Breakpoint (EF): EF = FB − ST

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all the tasks had been solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.
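For illustration, the hypothetical Java fragment below (not code from our subject systems) labels, in comments, the five types of statements on which we classified breakpoints:

import java.util.List;

public class StatementTypes {

    double total(List<Double> prices) {
        double sum = 0.0;                    // assignment
        int i = 0;                           // assignment
        while (i < prices.size()) {          // while-loop
            if (prices.get(i) > 0) {         // if-statement
                sum += prices.get(i);        // assignment
            }
            i++;
        }
        System.out.println(sum);             // call
        return sum;                          // return
    }
}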

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purposes, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault-location tasks, in which developers did not correct the faults, while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 analysed only one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET    (1)

Table 2 Elapsed time by task (average) – Study 1 (JabRef) and Study 2

Tasks    Average Times (min)   Std Devs (min)
318      44                    64
667      28                    29
669      22                    25
993      25                    25
1026     25                    17
PdfSam   54                    18
Raptor   59                    13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spent 27% of the total task duration to set the first breakpoint (std dev 17%). In Study 2, it took participants on average 23% of the task time to set the first breakpoint (std dev 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session(a). This effort is thus important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

(a) In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions: each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47): task elapsed time is inversely correlated with the time of the task's first breakpoint, following

f(x) = α / x^β    (2)

where α = 1.2 and β = 0.44.

Fig. 5 Relation between the time of the first breakpoint and the task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent, common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints were set on call statements, while only 1% (3/207) were set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints were set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where in Study 1 we found 17%, while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints were on control-flow statements, while 53% were on call statements and 17% on assignments.

Table 3 Study 1 – Breakpoints per type of statement

Statements     Numbers of Breakpoints   %
call           111                      53%
if-statement   39                       19%
assignment     36                       17%
return         18                       10%
while-loop     3                        1%

Table 4 Study 2 – Breakpoints per type of statement

Statements     Numbers of Breakpoints   %
call           43                       43%
if-statement   22                       22%
assignment     27                       27%
return         4                        4%
while-loop     4                        4%

Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow-related statements received comparatively fewer, the while-loop statement being the least common (2–4%).


RQ4: Are there consistent, common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on:

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Röhm [15]. We suggest that there is an underlying rationale in that decision because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 – Breakpoints on the same line of code (JabRef) by task

Tasks   Classes              Lines of Code   Breakpoints
0318    AuthorsFormatter     43              5
0318    AuthorsFormatter     131             3
0667    BasePanel            935             2
0667    BasePanel            969             3
0667    JabRefDesktop        430             2
0669    OpenDatabaseAction   268             2
0669    OpenDatabaseAction   433             4
0669    OpenDatabaseAction   451             4
0993    EntryEditor          717             2
0993    EntryEditor          720             2
0993    EntryEditor          723             2
0993    BibDatabase          187             2
0993    BibDatabase          456             2
1026    EntryEditor          1184            2
1026    BibtexParser         160             2


Table 6 Study 2 – Breakpoints on the same line of code (PdfSam)

Classes                 Lines of Code   Breakpoints
PdfReader               230             2
PdfReader               806             2
PdfReader               1923            2
ConsoleServicesFacade   89              2
ConsoleClient           81              2
PdfUtility              94              2
PdfUtility              96              2
PdfUtility              102             2

Table 7 Study 2 – Breakpoints on the same line of code (Raptor)

Classes             Lines of Code   Breakpoints
icsUtils            333             3
Game                1751            2
ExamineController   41              2
ExamineController   84              3
ExamineController   87              2
ExamineController   92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same classes received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on each class for different tasks, putting "Yes" if a type had a breakpoint for a task, producing Table 9. We also counted the number of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of the breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing the comparison of breakpoints across tasks.)


Table 8 Study 1 – Breakpoints on the same lines of code (JabRef) across all tasks

Classes                     Lines of Code        Breakpoints
BibtexParser                138, 151, 159        2, 2, 2
BibtexParser                160, 165, 168        3, 2, 3
BibtexParser                176, 198, 199, 299   2, 2, 2, 2
EntryEditor                 717, 720, 721        3, 4, 2
EntryEditor                 723, 837, 842        2, 3, 2
EntryEditor                 1184, 1393           3, 2
BibDatabase                 175, 187, 223, 456   2, 3, 2, 6
OpenDatabaseAction          433, 450, 451        4, 2, 4
JabRefDesktop               40, 84, 430          2, 2, 3
SaveDatabaseAction          177, 188             4, 2
BasePanel                   935, 969             2, 5
AuthorsFormatter            43, 131              5, 4
EntryTableTransferHandler   346                  2
FieldTextMenu               84                   2
JabRefFrame                 1119                 2
JabRefMain                  8                    5
URLUtil                     95                   2

Fig. 6 Methods with five or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6.


Table 9 Study 1 – Breakpoints by class across different tasks

Types                Tasks with breakpoints (of 5)   Breakpoints   Dev. Diversities
SaveDatabaseAction   3                               7             2
BasePanel            4                               14            7
JabRefDesktop        2                               9             4
EntryEditor          3                               36            4
BibtexParser         3                               44            6
OpenDatabaseAction   3                               19            13
JabRef               3                               3             3
JabRefMain           4                               5             4
URLUtil              2                               4             2
BibDatabase          3                               19            4

In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different participants set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is reference-management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig. 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelance developers21: 23 male and seven female. Our participants have, on average, six years of experience in software development (std dev: four years). They have, on average, 4.8 years of Java experience (std dev: 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert in Java.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 11 participated in fault location (controlled experiment: 7 control and 6 experimental) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, answering questions such as "Where was the fault?" (for Task 318) or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and questions about the positive and negative aspects of GV22.

6.1.4 Artifacts and Working Environment

After the subject's profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available online24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (class) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available at https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available online25, with data collected for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted of starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the locations of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not, and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 (OBS) video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type containing the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found: 27/31 sessions) and "bad" sessions (the fault was not found: 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available at https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of the participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig. 8 GV for Task 0318

For Task 667 (Figure 9), 95% of the participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig. 9 GV for Task 0667


Fig. 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of the participants (22/23) could suggest a "candidate" type for the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code beforehand.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants in the experimental group used the GV to support their tasks (video-recording analysis), navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants whether the GV is useful to support software-maintenance tasks. We report that 87% of participants agreed that the GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that the GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig. 11 GV usefulness - experimental phase one

Fig. 12 GV usefulness - experimental phase two

The analysis of our results suggests that the GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching code.


Table 10 Results from the control and experimental groups (average)

Task 0993
Metric            Control [C]   Experiment [E]   Δ [C−E] (s)   % [E/C]
First breakpoint  00:02:55      00:03:40         −44           126
Time to start     00:04:44      00:05:18         −33           112
Elapsed time      00:30:08      00:16:05         843           53

Task 1026
Metric            Control [C]   Experiment [E]   Δ [C−E] (s)   % [E/C]
First breakpoint  00:02:42      00:04:48         −126          177
Time to start     00:04:02      00:03:43         19            92
Elapsed time      00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time setting the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time finishing both tasks (47% less time for Task 0993 and 17% less time for Task 1026; for example, for Task 0993, the elapsed-time ratio is 00:16:05 / 00:30:08 = 965 s / 1,808 s ≈ 53%). This result suggests that participants invested more time in carefully toggling the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results for RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We return to some of these limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known to scale poorly, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap One participant warned that our visualisation could lead developers into a "trap" if all the developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to the other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting its main functionalities.

Complementing Differential Views Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes and methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


debugging sessions. It encourages us to pursue our research work in this direction and to perform more experiments to find further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, developers of debuggers, and educators. The SDI (and the GV) is open and freely available on-line28, and researchers can use it to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns, to identify the debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by one developer could help assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent toggling a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered the question of what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

Like any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study, by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both the SDI data and video captures of the participants' screens, and we compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, their levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data are. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data are sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all the faults had already been fixed in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility of generalising our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise works related to debugging to allow a better positioning of our study among the published research.

Program Understanding Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program-comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature- and fault-location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history, and defect-report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruptions [42], editing patterns [43,44], program-exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging Jiang et al. [33] studied the change-impact analysis process that developers should perform during software maintenance to make sure that changes do not introduce new faults. They conducted two studies about change-impact analysis during debugging sessions. They found that the programmers in their studies did static change-impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change-impact analysis after they made changes, by running the programs. In their study, programmers did not use any change-impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos from 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints, on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first


breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining the stepping paths of several debugging sessions in a graph visualisation gives developers elements to support correct hypotheses about fault locations without looking at the code beforehand and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching code.

Our results provide evidence that previous debugging sessions provide insights to developers and can be starting points for them when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour,


which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceeding of the 28th international conference on Software engineering - ICSE '06 p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97 pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings, International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings, Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11 p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013 pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig. 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developers, projects, and debugging events.


Fig. 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has source code and a file. The SDS only considers types whose source code is available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.

– Event is the event data collected when a developer performs some action during a debugging session.

The SDS provides several services for manipulating, querying, and searching the collected data: (1) the Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

29 http://projects.spring.io/spring-boot

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose name is "petrillo", in JSON format.
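For illustration, such a response could look like the HAL-style JSON below, which is the default envelope produced by Spring Data REST; the concrete fields (id, name) are assumptions derived from the Developer concept of the meta-model, not a verbatim SDS payload:

  {
    "_embedded": {
      "developers": [{
        "id": 42,
        "name": "petrillo",
        "_links": {
          "self": { "href": "http://swarmdebugging.org/developers/42" }
        }
      }]
    }
  }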

SQL Query Console The SDS provides a console30 that receives SQL queries on the debugging data, providing relational aggregations and functions.
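As an illustration, a researcher could rank the types accumulating the most shared breakpoints with a query along the following lines; the table and column names are assumptions based on the meta-model of Figure 14, not the actual SDS schema:

  -- Rank types by the number of breakpoints toggled on them,
  -- across all collected debugging sessions.
  SELECT t.full_name AS type,
         COUNT(b.id) AS breakpoints
  FROM breakpoint b
  JOIN type t ON t.id = b.type_id
  GROUP BY t.full_name
  ORDER BY breakpoints DESC;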

Full-text Search Engine The SDS also provides an ElasticSearch31, which is a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig. 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig. 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
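As a sketch of what such a query can look like, the following Cypher ranks the methods most often reached during one session; the node labels and relationship types are illustrative assumptions, not necessarily the exact SDS graph model:

  // Methods most often reached during session 42,
  // ordered by the number of recorded invocations.
  MATCH (s:Session {id: 42})-[:RECORDED]->(i:Invocation)-[:INVOKED]->(m:Method)
  RETURN m.name AS method, count(i) AS invocations
  ORDER BY invocations DESC
  LIMIT 10;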

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
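The sketch below illustrates how such a tracer can hook into the Eclipse debug framework; the SwarmTracer class and its forwarding comments are hypothetical, while DebugPlugin, IDebugEventSetListener, and IBreakpointListener are the standard Eclipse Debug Core APIs named above:

  import org.eclipse.core.resources.IMarkerDelta;
  import org.eclipse.debug.core.DebugEvent;
  import org.eclipse.debug.core.DebugPlugin;
  import org.eclipse.debug.core.IBreakpointListener;
  import org.eclipse.debug.core.IDebugEventSetListener;
  import org.eclipse.debug.core.model.IBreakpoint;

  public class SwarmTracer implements IDebugEventSetListener, IBreakpointListener {

    // Register this tracer with the Eclipse debug framework.
    public void start() {
      DebugPlugin.getDefault().addDebugEventListener(this);
      DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    // Called for every batch of debug events (suspend, resume, step, ...).
    public void handleDebugEvents(DebugEvent[] events) {
      for (DebugEvent event : events) {
        if (event.getKind() == DebugEvent.SUSPEND
            && event.getDetail() == DebugEvent.STEP_END) {
          // A step (into/over/return) has finished: analyze the stack
          // trace and send an invocation record to the SDS.
        }
      }
    }

    // Called when a developer toggles a new breakpoint.
    public void breakpointAdded(IBreakpoint breakpoint) {
      // Capture the breakpoint's type and location and POST it to the SDS.
    }

    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
  }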

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and the stack-trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig. 17 The Swarm Tracer architecture [17]

Fig. 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing


Fig. 19 Breakpoint search tool (fuzzy search example)

an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs They are direct call graphs [31], as shown in Figure 21, to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


Fig. 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
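Such a search can be expressed against the full-text engine roughly as follows; the index and field names are illustrative assumptions, while the fuzzy clause is standard ElasticSearch query DSL:

  POST /breakpoints/_search
  {
    "query": {
      "fuzzy": {
        "typeName": { "value": "fcatory" }
      }
    }
  }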


Fig. 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by other methods but do not invoke any themselves.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all invoking vertexes and β is the subset of all invoked vertexes, then the starting and ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
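These definitions translate directly into code. The sketch below, in which methods are identified by hypothetical name strings, computes both sets from a list of invocation edges:

  import java.util.HashSet;
  import java.util.List;
  import java.util.Set;

  public class GraphEndpoints {
    public static void main(String[] args) {
      // Each edge is a pair <invoking method, invoked method>.
      List<String[]> edges = List.of(
          new String[] {"Main.run", "Circle.draw"},
          new String[] {"Circle.draw", "DrawingAPI.drawCircle"});

      Set<String> invoking = new HashSet<>(); // alpha
      Set<String> invoked = new HashSet<>();  // beta
      for (String[] e : edges) {
        invoking.add(e[0]);
        invoked.add(e[1]);
      }

      Set<String> starting = new HashSet<>(invoking); // alpha minus beta
      starting.removeAll(invoked);
      Set<String> ending = new HashSet<>(invoked);    // beta minus alpha
      ending.removeAll(invoking);

      System.out.println("Starting methods: " + starting); // [Main.run]
      System.out.println("Ending methods: " + ending);     // [DrawingAPI.drawCircle]
    }
  }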

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, software exploration divided by sessions and its call graphs are easy to understand because only intentionally visited areas are shown on these


graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

Page 2: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

2Please give a shorter version with authorrunning and titlerunning prior to maketitle

Debugging is a common activity during software development mainte-nance and evolution [1] Developers use debugging tools to detect locateand correct faults Debugging tools can be interactive or automated

Interactive debugging tools aka debuggers such as sdb [2] dbx [3] orgdb [4] have been used by developers for decades Modern debuggers are of-ten integrated in interactive environments eg DDD [5] or the debuggers ofEclipse NetBeans IntelliJ IDEA and Visual Studio They allow developersto navigate through the code look for locations to place breakpoints and stepoverinto statements While stepping debuggers can traverse method invoca-tions and allow developers to toggle one or more breakpoints and stoprestartexecutions Thus they allow developers to gain knowledge about programsand the causes of faults to fix them

Automated debugging tools require both successful and failed runs and donot support programs with interactive inputs [6] Consequently they have notbeen widely adopted in practice Moreover automated debugging approachesare often unable to indicate the ldquotruerdquo locations of faults [7] Other hybridtools such as slicing and query languages may help developers but there isinsufficient evidence that they help developers during debugging

Although Integrated Development Environments (IDEs) encourage devel-opers to work collaboratively exchanging code through Git or assessing codequality with SonarQube one activity remains solitary debugging Debuggingis still an individual activity during which a developer explores the sourcecode of the system under development or maintenance using the debugger pro-vided by an IDE She steps into hundreds of statements and traverses dozensof method invocations painstakingly to gain an understanding of the systemMoreover within modern interactive debugging tools such as those includedin Eclipse or IntelliJ a debugging session cannot start if the developer does notset a breakpoint Consequently it is mandatory to set at least one breakpointto launch an interactive debugging session

Several studies have shown that developers spend over two-thirds of theirtime investigating code and one-third of this time is spent in debugging [8910] However developers do not reuse the knowledge accumulated duringdebugging directly When debugging is over they loose track of the pathsthat they followed into the code and of the breakpoints that they toggledMoreover they cannot share this knowledge with other developers easily If afault re-appears in the system or if a new fault similar to a previous one islogged the developer must restart the exploration from the beginning

In fact debugging tools have not changed substantially in the last 30 yearsdevelopersrsquo primary tools for debugging their programs are still breakpoint de-buggers and print statements Indeed changing the way developers debug theirprograms is one of the main motivations of our work We are convinced thata collaborative way of using contextual information of (previous) debuggingsessions to support (future) debugging activities is a very interesting approach

Roszligler [7] advocated for the development of a new family of debuggingtools that use contextual information To build context-aware debugging toolsresearchers need an understanding of developersrsquo debugging sessions to use

Swarm Debugging the Collective Intelligence on Interactive Debugging 3

this information as context for their debugging Thus researchers need toolsto collect and share data about developersrsquo debugging sessions

Maalej et al [11] observed that capturing contextual information requiresthe instrumentation of the IDE and continuous observation of the developersrsquoactivities within the IDE Studies by Storey et al [12] showed that the newergeneration of developers who are proficient in social media are comfortablewith sharing such information Developers are nowadays open transparenteager to share their knowledge and generally willing to allow informationabout their activities to be collected by the IDEs automatically [12]

Considering this context we introduce the concept of Swarm Debug-ging (SD) to (1) capture debugging contextual information (2) share it and(3) reuse it across debugging sessions and developers We build the concept ofSwarm Debugging based on the idea that many developers performing debug-ging sessions independently are in fact building collective knowledge whichcan be shared and reused with adequate support Thus we are convinced thatdevelopers need support to collect store and share this knowledge ie in-formation from and about their debugging sessions including but not limitedto breakpoints locations visited statements and traversed paths To providesuch support Swarm Debugging includes (i) the Swarm Debugging Infrastruc-ture (SDI) with which practitioners and researchers can collect and share dataabout developersrsquo interactive debugging sessions and (ii) the Swarm Debug-ging Global View (GV) to display debugging paths

As a consequence of adopting SD an interesting question emerges whatdebugging information is useful to share among developers to ease debuggingDebugging provides a lot of information which could be possibly considereduseful to improve software comprehension but we are particularly interestedin two pieces of debugging information breakpoints (and their locations) andsessions (debugging paths) because these pieces of information are essentialfor the two main activities during debugging setting breakpoints and steppinginoverout statements

In general developers initiate an interactive debugging session by settinga breakpoint Setting a breakpoint is one of the most frequently used fea-tures of IDEs [13] To decide where to set a breakpoint developers use theirobservations recall their experiences with similar debugging tasks and formu-late hypotheses about their tasks [14] Tiarks and Rohms [15] observed thatdevelopers have difficulties in finding locations for setting the breakpointssuggesting that this is a demanding activity and that supporting developersto set appropriate breakpoints could reduce debugging effort

We conducted two sets of studies with the aim of understanding how developers set breakpoints and navigate (step) during debugging sessions. In observational studies, we collected and analyzed more than 10 hours of developers' videos from 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems. These observational studies help us understand how developers use breakpoints (RQ1 to RQ4).


We also conducted two studies with 30 professional developers, a qualitative evaluation and a controlled experiment, to assess whether debugging sessions shared through our Global View visualisation support developers in their debugging tasks and are useful for sharing debugging tasks among developers (RQ5 and RQ6). We collected participants' answers in electronic forms and more than 3 hours of debugging sessions on video.

This paper makes the following contributions:

– We introduce a novel approach for debugging, named Swarm Debugging (SD), based on the concepts of Swarm Intelligence and Information Foraging Theory.

– We present an infrastructure, the Swarm Debugging Infrastructure (SDI), to gather, store, and share data about interactive debugging activities to support SD.

– We provide evidence about the relations between tasks' elapsed times, developers' expertise, breakpoint setting, and debugging patterns.

– We present a new visualisation technique, the Global View (GV), built on debugging sessions shared by developers to ease debugging.

– We provide evidence about the usefulness of sharing debugging sessions to ease developers' debugging.

This paper extends our previous works [16,17,18] as follows. First, we summarize the main characteristics of the Swarm Debugging approach, providing a theoretical foundation for Swarm Debugging using Swarm Intelligence and Information Foraging Theory. Second, we present the Swarm Debugging Infrastructure (SDI). Third, we report an experiment on the debugging behavior of 30 professional developers to evaluate whether sharing debugging sessions adequately supports their debugging tasks.

The remainder of this article is organized as follows. Section 2 provides some fundamentals of debugging and the foundations of SD: the concepts of swarm intelligence and information foraging theory. Section 3 describes our approach and its implementation, the Swarm Debugging Infrastructure. Section 6 presents an experiment to assess the benefits that our SD approach can bring to developers, and Section 5 reports two experiments that were conducted using the SDI to understand developers' debugging habits. Next, Section 7 discusses implications of our results, while Section 8 presents threats to the validity of our study. Section 9 summarizes related work, and finally, Section 10 concludes the paper and outlines future work.

2 Background

This section provides background information about the debugging activity and the setting of breakpoints. In the following, we use "failures" to mean unintended behaviours of a program, i.e., when the program does something that it should not, and "faults" to mean the incorrect statements in the source code causing failures. The purpose of debugging is to locate and correct faults, hence to fix failures.


2.1 Debugging and Interactive Debugging

The IEEE Standard Glossary of Software Engineering Terminology (see the definition at the beginning of Section 1) defines debugging as the act of detecting, locating, and correcting bugs in a computer program. Debugging techniques include the use of breakpoints, desk checking, dumps, inspection, reversible execution, single-step operations, and traces.

Araki et al. [19] describe debugging as a process in which developers make hypotheses about the root cause of a problem or defect and verify these hypotheses by examining different parts of the source code of the program.

Interactive debugging consists of using a tool, i.e., a debugger, to detect, locate, and correct a fault in a program. It is a process also known as program animation, stepping, or following execution [20]. Developers often refer to this process simply as "debugging" because several IDEs provide debuggers to support it. However, it must be noted that while debugging is the process of finding faults, interactive debugging is one particular debugging approach in which developers use interactive tools. Expressions such as "interactive debugging", "stepping", and "debugging" are used interchangeably, and there is not yet a consensus on the best name for this process.

2.2 Breakpoints and Supporting Mechanisms

Generally, breakpoints allow intentionally pausing the execution of a program for debugging purposes: they are a means of acquiring knowledge about a program during its execution, for example, to examine the call stack and variable values when the control flow reaches the locations of the breakpoints. Thus, a breakpoint indicates the location (line) in the source code of a program where a pause occurs during its execution.

Depending on the programming language, its run-time environment (in particular, the capabilities of its virtual machine, if any), and the debugger, different types of breakpoints may be available to developers. These types include static breakpoints [21], which pause the execution of a program unconditionally, and dynamic breakpoints [22], which pause depending on some conditions, threads, or numbers of hits.

Other types of breakpoints include watchpoints, which pause the execution when a variable being watched is read and/or written. IDEs offer the means to specify the different types of breakpoints, depending on the programming languages and their run-time environments. Figures 1-A and 1-B show examples of static and dynamic breakpoints in Eclipse. In the rest of this paper, we focus on static breakpoints because they are the most used of all types [14].
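To make these types concrete, the minimal Java sketch below (our own hypothetical illustration, not code from the studies) shows how the same line can host either a static breakpoint, which always pauses, or a dynamic (conditional) breakpoint whose IDE-side condition must hold for execution to pause.

public class BreakpointDemo {

    public static void main(String[] args) {
        int total = 0;
        for (int i = 0; i < 100; i++) {
            // A static breakpoint on the next line pauses execution
            // unconditionally, on every iteration.
            total += compute(i);
            // A dynamic (conditional) breakpoint on the same line, with an
            // IDE-side condition such as "i == 42", pauses only when the
            // condition holds; a hit-count condition pauses after N hits.
        }
        System.out.println(total);
    }

    private static int compute(int i) {
        // A watchpoint on a watched field would pause when the field is
        // read and/or written, without any breakpoint on a statement.
        return i * i;
    }
}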

There are different mechanisms for setting a breakpoint in the code:


Fig. 1 Setting a static breakpoint (A) and a conditional breakpoint (B) using the Eclipse IDE

– GUI: most IDEs or browsers offer a visual way of adding a breakpoint, usually by clicking at the beginning of the line on which to set the breakpoint, e.g., Chrome^1, Visual Studio^2, IntelliJ^3, and Xcode^4.

– Command line: some programming languages offer debugging tools on the command line, so an IDE is not necessary to debug the code, e.g., JDB^5, PDB^6, and GDB^7.

– Code: some programming languages allow using syntactical elements to set breakpoints, as if they were 'annotations' in the code. This approach often only supports the setting of a breakpoint, and it is necessary to use it in conjunction with the command line or GUI. Some examples are the Ruby debugger^8, Firefox^9, and Chrome^10.

A set of features in a debugger allows developers to control the flow of the execution across breakpoints, i.e., Call Stack features, which enable continuing or stepping.

A developer can opt for continuing, in which case the debugger resumes execution until the next breakpoint is reached or the program exits. Conversely, stepping allows the developer to run the program flow step by step.

1 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints
2 https://msdn.microsoft.com/en-us/library/5557y8b4.aspx
3 https://www.jetbrains.com/help/idea/2016.3/debugger-basics.html
4 http://jeffreysambells.com/2014/01/14/using-breakpoints-in-xcode
5 http://docs.oracle.com/javase/7/docs/technotes/tools/windows/jdb.html
6 https://docs.python.org/2/library/pdb.html
7 ftp://ftp.gnu.org/old-gnu/Manuals/gdb-5.1.1/html_node/gdb_37.html
8 https://github.com/cldwalker/debugger
9 https://developer.mozilla.org/pt-BR/docs/Web/JavaScript/Reference/Statements/debugger
10 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints


The definition of a step varies across programming languages and debuggers, but it generally includes invoking a method and executing a statement. While stepping, a developer can navigate between steps using the following commands, illustrated in the sketch after this list:

– Step Over: the debugger steps over a given line. If the line contains a function, then the function is executed and the result returned without stepping through each of its lines.

– Step Into: the debugger enters the function at the current line and continues stepping from there, line by line.

– Step Out: this action takes the debugger back to the line where the current function was called.
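The following minimal Java sketch (hypothetical code, for illustration only) summarizes how the three commands behave when the debugger is paused inside main():

public class SteppingDemo {

    public static void main(String[] args) {
        int price = readPrice();          // Suppose the debugger is paused here:
        // - Step Over runs readPrice() entirely and stops on the next line;
        // - Step Into stops on the first line inside readPrice().
        int taxed = addTax(price);
        System.out.println(taxed);
    }

    static int readPrice() {
        return 100;
    }

    static int addTax(int price) {
        // If paused here after a Step Into, Step Out finishes addTax() and
        // returns control to the line in main() that called it.
        return price + (price / 10);
    }
}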

To start an interactive debugging session, developers set a breakpoint; otherwise, the IDE would not stop and enter its interactive mode. For example, the Eclipse IDE automatically opens the "Debugging Perspective" when execution hits a breakpoint. A developer can run a system in debugging mode without setting breakpoints, but she must set a breakpoint to be able to stop the execution, step in, and observe variable states. Briefly, there is no interactive debugging session without at least one breakpoint set in the code. Finally, some debuggers allow debugging remotely, for example, to perform hot-fixes or to test mobile applications and systems operating in remote configurations.

2.3 Self-organization and Swarm Intelligence

Self-organization is a concept that emerged from the Social Sciences and Biology. It is defined as the set of dynamic mechanisms enabling structures to appear at the global level of a system from interactions among its lower-level components, without being explicitly coded at the lower levels. Swarm intelligence (SI) describes the behavior resulting from the self-organization of social agents, such as insects [23]. Ant nests and the societies that they house are examples of SI [24]. Individual ants can only perform relatively simple activities, yet the whole colony can collectively accomplish sophisticated activities. Ants achieve SI by exchanging information encoded as chemical signals (pheromones), e.g., indicating a path to follow or an obstacle to avoid.

Similarly, SI can be used as a metaphor to understand or explain the development of multiversion, large, and complex software systems built by software teams. Individual developers can usually perform activities without having a global understanding of the whole system [25]. In a bird's-eye view, software development is analogous to SI: groups of agents, interacting locally with one another and with their environment and following simple rules, lead to the emergence of global behaviors previously unknown or impossible to the individual agents. We claim that the similarities between the SI of ant nests and complex software systems are not a coincidence. Cockburn [26] suggested that the best architectures, requirements, and designs emerge from self-organizing developers, growing in steps and following their changing knowledge and the changing wishes of the user community, i.e., a typical example of swarm intelligence.

[Figure 2 here. Recoverable elements: developers Dev1..DevN perform single debugging sessions; listeners collect data (A) as debugging events (B); stored data form crowd debugging sessions (C); transformed information feeds visualisations, searching tools, and recommendation systems (D), closing a positive-feedback loop.]

Fig. 2 Overview of the Swarm Debugging approach

2.4 Information Foraging

Information Foraging Theory (IFT) is based on the optimal foraging theory developed by Pirolli and Card [27] to understand how people search for information. IFT is rooted in biological studies and theories of how animals hunt for food. It was extended to debugging by Lawrance et al. [27].

However, no previous work proposes the sharing of knowledge related to debugging activities. Differently from works that use IFT with a one-prey/one-predator model [28], we are interested in many developers working independently in many debugging sessions and sharing information to allow SI to emerge. Thus, debugging becomes a foraging process in an SI environment.

These concepts, SI and IFT, have led to the design of a crowd approach applied to debugging activities: a different, collective way of doing debugging that collects, shares, and retrieves information from (previous and current) debugging sessions to support (current and future) debugging sessions.

3 The Swarm Debugging Approach

Swarm Debugging (SD) applies swarm intelligence to interactive debugging data to create knowledge that supports software development activities. Swarm Debugging works as follows.


First, several developers perform their individual, independent debugging activities. During these activities, debugging events are collected by listeners (Label A in Figure 2), for example, breakpoint-toggling and stepping events (Label B in Figure 2), which are then stored in a debugging-knowledge repository (Label C in Figure 2). For accessing this repository, services are defined and implemented in the SDI. For example, stored events are processed by dedicated algorithms (Label D in Figure 2) (1) to create (several types of) visualizations, (2) to offer (distinct ways of) searching, and (3) to provide recommendations to assist developers during debugging. Recommendations concern the locations where to toggle breakpoints. Storing and using these events allows sharing knowledge among developers, creating a collective intelligence about the software systems and their debugging.

We chose to instrument the Eclipse IDE, a popular IDE, to implement Swarm Debugging and to reach a large number of users. Also, we use services in the cloud to collect the debugging events, to process these events, and to provide visualizations and recommendations from these events. Thus, we decoupled data collection from data usage, allowing other researchers/tool vendors to use the collected data.

During debugging, developers analyze the code, toggling breakpoints and stepping in and through statements. While traditional dynamic analysis approaches collect all interactions, states, or events, SD collects only invocations explicitly explored by developers. The SDI collects only visited areas and paths (chains of invocations triggered by, e.g., Step Into or F5 in the Eclipse IDE) and thus does not suffer from the performance or memory issues of omniscient debuggers [29] or tracing-based approaches.
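To make the collected data concrete, the sketch below shows one plausible shape for the two kinds of events such a listener could emit; the record names and fields are our own assumptions for exposition, not SDI's actual schema.

import java.time.Instant;

// Illustrative event types a debugging listener could emit and a
// debugging-knowledge repository could store (assumed, not SDI's schema).
record BreakpointEvent(String developer, String session, String type,
                       String method, int line, Instant toggledAt) { }

// One Step Into produces one invocation edge from caller to callee.
record InvocationEvent(String developer, String session,
                       String callerMethod, String calleeMethod,
                       Instant steppedAt) { }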

Our decision to record information about breakpoints and stepping is well supported by a study by Beller et al. [30]. A finding of this study is that setting breakpoints and stepping through code are the most used debugging features. They showed that most of the recorded debugging events are related to the creation (45.44%), removal (43.62%), or adjustment of breakpoints, hitting them during debugging, and stepping through the source code. Furthermore, other advanced debugging features, such as defining watches and modifying variable values, were used much less [30].

4 SDI in a Nutshell

To evaluate the Swarm Debugging approach, we implemented the Swarm Debugging Infrastructure (see https://github.com/SwarmDebugging). The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools for collecting, storing, sharing, retrieving, and visualizing data collected during developers' debugging activities. The SDI is an Eclipse IDE^11 plug-in integrated with the Eclipse Debug core. It is organized in three main modules: (1) the Swarm Debugging Services, (2) the Swarm Debugging Tracer, and (3) the Swarm Debugging Views. All the implementation details of the SDI are available in the Appendix.

11 https://www.eclipse.org

Fig. 3 GV elements - types (nodes), invocations (edges), and task filter area

4.1 Swarm Debugging Global View

The Swarm Debugging Global View (GV) is a call-graph visualization for modeling software, based on the directed call graph [31], which makes explicit the hierarchical relationships among invoked methods. The visualization uses rounded gray boxes (Figure 3-A) to represent types or classes (nodes) and oriented arrows (Figure 3-B) to express invocations (edges). GV is built from debugging-session context data previously collected by developers for different tasks.
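As an illustration of how many sessions could be combined into one graph, the sketch below accumulates stepped invocations into weighted edges (GV draws thicker lines for higher weights, as described below); the class and method names are our own illustration, not SDI's actual implementation.

import java.util.HashMap;
import java.util.Map;

// Sketch: aggregating Step Into events from many debugging sessions into
// the weighted edges of a call graph (edge weight ~ number of invocations).
class CallGraphBuilder {
    // key: "CallerType->CalleeType", value: number of recorded invocations
    private final Map<String, Integer> edgeWeights = new HashMap<>();

    void addInvocation(String callerType, String calleeType) {
        edgeWeights.merge(callerType + "->" + calleeType, 1, Integer::sum);
    }

    int weightOf(String callerType, String calleeType) {
        return edgeWeights.getOrDefault(callerType + "->" + calleeType, 0);
    }
}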

GV was implemented using Cytoscape.js [32], a graph-API JavaScript framework, applying its automatic layout manager 'breadthfirst'. As a web application, the SD visualisations can be integrated into an Eclipse view as an SWT Browser widget or accessed through a traditional browser, such as Mozilla Firefox or Google Chrome.

In this view, the grey boxes are types that developers visited during debugging sessions. The edges represent method calls (Step Into or F5 in Eclipse) performed by all developers in all traced tasks on a software project. Each edge colour represents a task, and line thickness is proportional to the number of invocations. Each debugging session contributes its context, and the visualisation combines all collected invocations. The visualisation is organised in layers or stacks, and each line is a layer of invocations. The starting points (non-invoked methods) are allocated on top of a tree, adjacent to the nodes in their invocation sequences. Besides, developers can go directly to a type in the Eclipse editor by double-clicking on a node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and to zoom in/out on the visualisation. Figure 4 shows an example of GV with all tasks for the JabRef system, for which we have data about 8 tasks.

Fig. 4 GV on all tasks

GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including type declarations and method invocations explored by developers based on their decisions.

5 Using SDI to Understand Debugging Activities

The first benefit of the SDI is that it allows collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviors during debugging activities. To illustrate this point, we conducted two experiments using the SDI to understand developers' debugging habits: the times and effort with which they set breakpoints and the locations where they set breakpoints.

Our analysis builds upon three independent sets of observations involving, in total, three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions available from our own collected videos (Study 1) and from an empirical study performed by Jiang et al. [33] (Study 2).

In these studies, we answered the following research questions:

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?


RQ3: Are there consistent, common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent, common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef^12 version 3.2 as the subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion and low coupling, thus reducing the potential commingling effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service^13. Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montréal to participate in our study. We surveyed all the participants' background information before the study^14. The survey included questions about participants' self-assessment of their level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD, interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the Eclipse debugger.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose fault-fixing tasks that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available on https://goo.gl/forms/dxCQaBke2l2cqjB42

Swarm Debugging the Collective Intelligence on Interactive Debugging 13

Table 1 Summary of the issues considered in JabRef in Study 1

Issue | Summary
318   | "Normalize to Bibtex name format"
667   | "hash/pound sign causes URL link to fail"
669   | "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993   | "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026  | "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial^15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video^16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating step-by-step how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to collect breakpoint-related events automatically. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software^17 (OBS), open-source software for live streaming and recording. We used OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 https://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that each fault would require about 20 minutes, which we discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
  – call: method/function invocations;
  – return: returns of values;
  – assignment: settings of values;
  – if-statement: conditional statements;
  – while-loop: loop iterations.

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started to count when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.

– End Time (T): the time when the participant finished a task.


– Elapsed End Time (ET): ET = T − ST;
– Elapsed Time to First Breakpoint (EF): EF = FB − ST (both computations are sketched in code below).
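A minimal Java sketch of these metric computations, assuming the three timestamps were recovered from a session's video and logs (the record is our own illustration, not the actual analysis code):

import java.time.Duration;
import java.time.Instant;

// Per-session timestamps: ST (start), FB (first breakpoint), T (end).
record SessionMetrics(Instant start, Instant firstBreakpoint, Instant end) {
    Duration elapsedEndTime() {            // ET = T - ST
        return Duration.between(start, end);
    }
    Duration elapsedFirstBreakpoint() {    // EF = FB - ST
        return Duration.between(start, firstBreakpoint);
    }
}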

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaires and the videos. We knew the locations of the faults because all tasks had been solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge^18 (PDFSaM) and Raptor^19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purposes, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 took different approaches. The tasks in Study 1 were fault-location tasks, in which developers did not correct the faults, while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET    (1)
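For instance, with hypothetical numbers: a participant who sets the first breakpoint 7 minutes into a 28-minute task obtains MFB = 7/28 = 0.25, i.e., one quarter of the task elapsed before the first breakpoint.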

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Task   | Average Time (min) | Std Dev (min)
318    | 44                 | 64
667    | 28                 | 29
669    | 22                 | 25
993    | 25                 | 25
1026   | 25                 | 17
PdfSam | 54                 | 18
Raptor | 59                 | 13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, it took participants on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session.^a This effort is thus important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

a In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between the task elapsed time and the time of the first breakpoint (ρ = -0.47): the task elapsed time is inversely correlated with the time of the task's first breakpoint, following Equation 2:

f(x) = α / x^β    (2)

where α = 1.2 and β = 0.44.
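Taking the fitted coefficients above at face value, a worked example: for a 25-minute task, Equation 2 predicts f(25) = 1.2 / 25^0.44 ≈ 1.2 / 4.12 ≈ 0.29, i.e., the first breakpoint is expected at roughly 29% of the task duration, consistent with the quarter-of-the-session effort observed in RQ1.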

Fig. 5 Relation between the time of the first breakpoint and the task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set oftasks [17]


RQ3: Are there consistent, common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints were set on call statements, while only 1% (3/207) were set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints were set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements: in Study 1, we found 17%, while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of the breakpoints were on control-flow statements, while 53% were on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statement    | Breakpoints | %
call         | 111         | 53
if-statement | 39          | 19
assignment   | 36          | 17
return       | 18          | 10
while-loop   | 3           | 1

Table 4 Study 2 - Breakpoints per type of statement

Statement    | Breakpoints | %
call         | 43          | 43
if-statement | 22          | 22
assignment   | 27          | 27
return       | 4           | 4
while-loop   | 4           | 4


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements received comparatively fewer breakpoints, the while-loop statement being the least common (1-4%).


RQ4: Are there consistent, common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether different participants performing the same task, i.e., resolving the same fault, set breakpoints on the same line of code, comparing the breakpoints within the same task and across different tasks. We sorted all the breakpoints in our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.
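The counting can be summarized by the following Java sketch (our own illustration of the analysis, not the actual analysis scripts), which groups breakpoints by class and line; locations with a count of two or more are the recurring ones reported in Tables 5-7.

import java.util.List;
import java.util.Map;
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;

// Breakpoint is an illustrative record, not SDI's actual type.
record Breakpoint(String participant, String task, String className, int line) { }

class RecurringBreakpoints {
    // Count breakpoints per (class, line) location across all participants.
    static Map<String, Long> countPerLocation(List<Breakpoint> all) {
        return all.stream()
                  .collect(groupingBy(b -> b.className() + ":" + b.line(),
                                      counting()));
    }
}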

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 of class BasePanel, participants set breakpoints on:

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field);

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale to these decisions, because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints on the same line of code (JabRef) by task

Task | Class              | Line | Breakpoints
0318 | AuthorsFormatter   | 43   | 5
0318 | AuthorsFormatter   | 131  | 3
0667 | BasePanel          | 935  | 2
0667 | BasePanel          | 969  | 3
0667 | JabRefDesktop      | 430  | 2
0669 | OpenDatabaseAction | 268  | 2
0669 | OpenDatabaseAction | 433  | 4
0669 | OpenDatabaseAction | 451  | 4
0993 | EntryEditor        | 717  | 2
0993 | EntryEditor        | 720  | 2
0993 | EntryEditor        | 723  | 2
0993 | BibDatabase        | 187  | 2
0993 | BibDatabase        | 456  | 2
1026 | EntryEditor        | 1184 | 2
1026 | BibtexParser       | 160  | 2


Table 6 Study 2 - Breakpoints on the same line of code (PdfSam)

Class                 | Line | Breakpoints
PdfReader             | 230  | 2
PdfReader             | 806  | 2
PdfReader             | 1923 | 2
ConsoleServicesFacade | 89   | 2
ConsoleClient         | 81   | 2
PdfUtility            | 94   | 2
PdfUtility            | 96   | 2
PdfUtility            | 102  | 2

Table 7 Study 2 - Breakpoints on the same line of code (Raptor)

Class             | Line | Breakpoints
icsUtils          | 333  | 3
Game              | 1751 | 2
ExamineController | 41   | 2
ExamineController | 84   | 3
ExamineController | 87   | 2
ExamineController | 92   | 2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on each class for different tasks, marking a task with "Yes" if the type had a breakpoint for it, producing Table 9. We also counted the number of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of the breakpoints. For example, class BibtexParser had 21% (44/207) of the breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints on the same line of code (JabRef) across all tasks

Class                     | Line | Breakpoints
BibtexParser              | 138  | 2
BibtexParser              | 151  | 2
BibtexParser              | 159  | 2
BibtexParser              | 160  | 3
BibtexParser              | 165  | 2
BibtexParser              | 168  | 3
BibtexParser              | 176  | 2
BibtexParser              | 198  | 2
BibtexParser              | 199  | 2
BibtexParser              | 299  | 2
EntryEditor               | 717  | 3
EntryEditor               | 720  | 4
EntryEditor               | 721  | 2
EntryEditor               | 723  | 2
EntryEditor               | 837  | 3
EntryEditor               | 842  | 2
EntryEditor               | 1184 | 3
EntryEditor               | 1393 | 2
BibDatabase               | 175  | 2
BibDatabase               | 187  | 3
BibDatabase               | 223  | 2
BibDatabase               | 456  | 6
OpenDatabaseAction        | 433  | 4
OpenDatabaseAction        | 450  | 2
OpenDatabaseAction        | 451  | 4
JabRefDesktop             | 40   | 2
JabRefDesktop             | 84   | 2
JabRefDesktop             | 430  | 3
SaveDatabaseAction        | 177  | 4
SaveDatabaseAction        | 188  | 2
BasePanel                 | 935  | 2
BasePanel                 | 969  | 5
AuthorsFormatter          | 43   | 5
AuthorsFormatter          | 131  | 4
EntryTableTransferHandler | 346  | 2
FieldTextMenu             | 84   | 2
JabRefFrame               | 1119 | 2
JabRefMain                | 8    | 5
URLUtil                   | 95   | 2

Fig. 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We find that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints, and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.


Table 9 Study 1 - Breakpoints by class across different tasks

Type               | Tasks with breakpoints | Breakpoints | Dev. diversity
SaveDatabaseAction | 3 of 5                 | 7           | 2
BasePanel          | 4 of 5                 | 14          | 7
JabRefDesktop      | 2 of 5                 | 9           | 4
EntryEditor        | 3 of 5                 | 36          | 4
BibtexParser       | 3 of 5                 | 44          | 6
OpenDatabaseAction | 3 of 5                 | 19          | 13
JabRef             | 3 of 5                 | 3           | 3
JabRefMain         | 4 of 5                 | 5           | 4
URLUtil            | 2 of 5                 | 4           | 2
BibDatabase        | 3 of 5                 | 19          | 4


Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study Design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef^20 as the subject system. JabRef is reference-management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig. 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers^21, 23 male and seven female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have on average 4.8 years of Java experience (std. dev. 3.3 years), and 97% of them use Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in the qualitative evaluation of GV, and 13 participated in the fault-location controlled experiment (7 in the control group and 6 in the experimental group) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault?" for Task 318 or, for Task 1173, "Where would you toggle a breakpoint to fix the fault?", as well as questions about positive and negative aspects of GV.^22

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial^23 that explained how to install and configure all the tools needed to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environments were correctly configured and that the participants understood the instructions. The warm-up task was described in a video to guide the participants, which we made available on-line^24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV in a regular Web browser, without access to the JabRef source code. We asked the participants to identify the "types" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 https://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line^25, with data collected for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the locations of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to move on to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server^26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software^27 video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type containing the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found - 27/31 sessions) and "bad" sessions (the fault was not found - 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com

For Task 318 (Figure 8), 95% of the participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig. 8 GV for Task 0318

For Task 667 (Figure 9), 95% of the participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig. 9 GV for Task 0667


Fig. 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of the participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (0/23) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code beforehand.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video-recording analysis), navigating, reorganizing, and especially diving into types by double-clicking on selected nodes. We asked participants whether GV is useful to support software-maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.

28Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig. 11 GV usefulness - experimental phase one

Fig. 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful for supporting software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


Table 10 Results from the control and experimental groups (average)

Task 0993
Metric           | Control [C] | Experiment [E] | Δ [C-E] (s) | % [E/C]
First breakpoint | 00:02:55    | 00:03:40       | -44         | 126
Time to start    | 00:04:44    | 00:05:18       | -33         | 112
Elapsed time     | 00:30:08    | 00:16:05       | 843         | 53

Task 1026
Metric           | Control [C] | Experiment [E] | Δ [C-E] (s) | % [E/C]
First breakpoint | 00:02:42    | 00:04:48       | -126        | 177
Time to start    | 00:04:02    | 00:03:43       | 19          | 92
Elapsed time     | 00:24:58    | 00:20:41       | 257         | 83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time setting the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) compared to the control group. However, participants who used our approach spent less time finishing both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but then completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results for RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information through Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation, and we now discuss both its intrinsic and accidental advantages and limitations as reported by them. We return to some of these limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example, a null-pointer exception, and the location of the source of the fault, for example, a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale well, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information in the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as a developer performs some changes, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all the developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to the other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting a system's main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff shows where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that should be refactored if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information in our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently, in different parts of the graph, to avoid large nodes. We plan to modify the graph visualisation to offer a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem, because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to find further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for Software Engineering researchers, developers, debuggers' developers, and educators. SDI (and GV) are open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify the debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, breakpoints previously toggled by one developer could assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new session by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers; this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern their internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data are. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data are sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the systems used can affect our results.

9 Related Work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], or version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., builds on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that should be done by developers during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements to support developers' hypotheses about fault locations without looking at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters"; and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agency CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson. Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs. Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright. GNU DDD - Data Display Debugger (2010)
6. A. Ko. Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers. 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman. How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen. Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm. Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng. Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar. Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz. Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel. Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick. Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn. Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming. IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter. IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers. Proceedings of the 12th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker. Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices. Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder. IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung. IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, NY, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings, International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings, Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant. International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson. Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ACM Press, New York, NY, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso. Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken. Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, NY, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa. Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer. Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan. ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts, illustrated by a small code sketch after the list, include:

– Developer: an SDT user. She creates and executes debugging sessions.

– Product: the target software product. A product is a set of Eclipse projects (1 or more).

– Task: the task to be executed by developers.

– Session: represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig 14 The Swarm Debugging metadata [17]

– Type: represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.

– Method: a method associated with a type, which can be invoked during debugging sessions.

– Namespace: a container for types. In Java, namespaces are declared with the keyword package.

– Invocation: a method invoked from another method (or from the JVM, in the case of the main method).

– Breakpoint: represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.

– Event: the event data collected when a developer performs some actions during a debugging session.
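To make the meta-model concrete, the following minimal sketch shows how two of these concepts could be expressed as plain Java classes. The field names are illustrative simplifications, not the actual SDI source.

import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified domain entities (the actual SDI classes may differ).
class Developer { String name; }
class Product { String name; }
class Task { String title; }
class Event { String kind; long timestamp; }

class Breakpoint {
    Long id;
    String typeName;   // type on which the breakpoint was toggled
    String methodName; // enclosing method, if appropriate
    int lineNumber;    // location of the pause in the source file
}

public class Session {
    Long id;
    Developer developer; // who created and runs the session
    Product product;     // the target software product
    Task task;           // the task being executed
    List<Breakpoint> breakpoints = new ArrayList<>();
    List<Event> events = new ArrayList<>();
    // getters and setters omitted
}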

The SDS provides several services for manipulating, querying, and searching the collected data: (1) the Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
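As an illustration, such a request can be issued from Java 11's built-in HTTP client. This is a sketch: the endpoint is the one shown above, while the exact JSON shape of the response is an assumption.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SdsClientExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .build();
        // The body is expected to be a JSON document listing the matching developers.
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + ": " + response.body());
    }
}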

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database, without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
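For illustration, the same kind of pattern query could also be issued programmatically with the Neo4j Java driver (4.x). This is a sketch only: the connection details and the Method/INVOKES schema below are assumptions, not necessarily the schema used by the SDS.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class CypherExample {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                     AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            // Which methods invoke which during debugging sessions?
            Result result = session.run(
                    "MATCH (a:Method)-[:INVOKES]->(b:Method) RETURN a.name, b.name LIMIT 25");
            while (result.hasNext()) {
                Record row = result.next();
                System.out.println(row.get("a.name").asString()
                        + " -> " + row.get("b.name").asString());
            }
        }
    }
}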

Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
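The sketch below illustrates how such listeners are registered with the Eclipse Debug core API. It shows the mechanism only and is not the actual SDT source.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    public void register() {
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);                        // stepping and suspend events
        plugin.getBreakpointManager().addBreakpointListener(this); // breakpoint add/remove/change
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.STEP_END) {
                // A step (into/over/return) just finished: inspect the stack
                // frames here and send an invocation entry to the SDS.
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // Send the breakpoint location (type, method, line) to the SDS.
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}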

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and its related data. It also stores data about called methods, storing an invocation entry for each invoking/invoked method pair. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices, which send RESTful messages to a SwarmRestFacade connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace, without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
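Programmatically, such a fuzzy query could be expressed with the ElasticSearch high-level Java client along the following lines. This is a sketch: the index and field names are assumptions for illustration.

import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class BreakpointSearchExample {
    public static void main(String[] args) throws Exception {
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {
            SearchRequest request = new SearchRequest("breakpoints"); // index name: an assumption
            request.source(new SearchSourceBuilder()
                    .query(QueryBuilders.fuzzyQuery("typeName", "fcatory"))); // still matches "factory"
            SearchResponse response = client.search(request, RequestOptions.DEFAULT);
            response.getHits().forEach(hit -> System.out.println(hit.getSourceAsString()));
        }
    }
}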


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked by methods, then the Starting and Ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
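Deriving both sets from the collected invocation edges is straightforward; the following self-contained sketch (with invented method names) mirrors the definitions above.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GraphEndpoints {
    public static void main(String[] args) {
        // Invocation edges collected during a session: {invoking, invoked}.
        List<String[]> edges = List.of(
                new String[] {"Main.run", "Parser.parse"},
                new String[] {"Parser.parse", "Lexer.next"});

        Set<String> invoking = new HashSet<>(); // alpha: vertexes that invoke
        Set<String> invoked = new HashSet<>();  // beta: vertexes that are invoked
        for (String[] edge : edges) {
            invoking.add(edge[0]);
            invoked.add(edge[1]);
        }

        Set<String> starting = new HashSet<>(invoking); // alpha minus beta
        starting.removeAll(invoked);
        Set<String> ending = new HashSet<>(invoked);    // beta minus alpha
        ending.removeAll(invoking);

        System.out.println("Starting methods: " + starting); // [Main.run]
        System.out.println("Ending methods: " + ending);     // [Lexer.next]
    }
}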

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory for developer explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


Swarm Debugging the Collective Intelligence on Interactive Debugging 3

this information as context for their debugging Thus researchers need toolsto collect and share data about developersrsquo debugging sessions

Maalej et al [11] observed that capturing contextual information requiresthe instrumentation of the IDE and continuous observation of the developersrsquoactivities within the IDE Studies by Storey et al [12] showed that the newergeneration of developers who are proficient in social media are comfortablewith sharing such information Developers are nowadays open transparenteager to share their knowledge and generally willing to allow informationabout their activities to be collected by the IDEs automatically [12]

Considering this context we introduce the concept of Swarm Debug-ging (SD) to (1) capture debugging contextual information (2) share it and(3) reuse it across debugging sessions and developers We build the concept ofSwarm Debugging based on the idea that many developers performing debug-ging sessions independently are in fact building collective knowledge whichcan be shared and reused with adequate support Thus we are convinced thatdevelopers need support to collect store and share this knowledge ie in-formation from and about their debugging sessions including but not limitedto breakpoints locations visited statements and traversed paths To providesuch support Swarm Debugging includes (i) the Swarm Debugging Infrastruc-ture (SDI) with which practitioners and researchers can collect and share dataabout developersrsquo interactive debugging sessions and (ii) the Swarm Debug-ging Global View (GV) to display debugging paths

As a consequence of adopting SD an interesting question emerges whatdebugging information is useful to share among developers to ease debuggingDebugging provides a lot of information which could be possibly considereduseful to improve software comprehension but we are particularly interestedin two pieces of debugging information breakpoints (and their locations) andsessions (debugging paths) because these pieces of information are essentialfor the two main activities during debugging setting breakpoints and steppinginoverout statements

In general developers initiate an interactive debugging session by settinga breakpoint Setting a breakpoint is one of the most frequently used fea-tures of IDEs [13] To decide where to set a breakpoint developers use theirobservations recall their experiences with similar debugging tasks and formu-late hypotheses about their tasks [14] Tiarks and Rohms [15] observed thatdevelopers have difficulties in finding locations for setting the breakpointssuggesting that this is a demanding activity and that supporting developersto set appropriate breakpoints could reduce debugging effort

We conducted two sets of studies with the aim of understanding how de-velopers set breakpoints and navigate (step) during debugging sessions Inobservational studies we collected and analyzed more than 10 hours of devel-opersrsquo videos in 45 debugging sessions performed by 28 different independentdevelopers containing 307 breakpoints on three software systems These ob-servational studies help us understand how developers use breakpoints (RQ1to RQ4)

4Please give a shorter version with authorrunning and titlerunning prior to maketitle

We also conducted with 30 professional developers two studies a qualitativeevaluation and a controlled experiment to assess whether debugging sessionsshared through our Global View visualisation support developers in theirdebugging tasks and is useful for sharing debugging tasks among developers(R5 and RQ6) We collected participantsrsquo answers in electronic forms and morethan 3 hours of debugging sessions on video

This paper has the following contributions

ndash We introduce a novel approach for debugging named Swarm Debugging(SD) based on the concept of Swarm Intelligence and Information ForagingTheory

ndash We present an infrastructure the Swarm Debugging Infrastructure (SDI)to gather store and share data about interactive debugging activities tosupport SD

ndash We provide evidence about the relation between tasksrsquo elapsed time de-velopersrsquo expertise breakpoints setting and debugging patterns

ndash We present a new visualisation technique Global View (GV) built onshared debugging sessions by developers to ease debugging

ndash We provide evidence about the usefulness of sharing debugging session toease developersrsquo debugging

This paper extends our previous works [161718] as follows First we sum-marize the main characteristics of the Swarm Debugging approach providinga theoretical foundation to Swarm Debugging using Swarm Intelligence andInformation Foraging Theory Second we present the Swarm Debugging Infras-tructure (SDI) Third we perform an experiment on the debugging behaviorof 30 professional developers to evaluate if sharing debugging sessions supportsadequately their debugging tasks

The remainder of this article is organized as follows Section 2 providessome fundamentals of debugging and the foundations of SD the conceptsof swarm intelligence and information foraging theory Section 3 describesour approach and its implementation the Swarm Debugging InfrastructureSection 6 presents an experiment to assess the benefits that our SD approachcan bring to developers and Section 5 reports two experiments that wereconducted using SDI to understand developers debugging habits Next Section7 discusses implications of our results while Section 8 presents threats to thevalidity of our study Section 9 summarizes related work and finally Section10 concludes the paper and outlines future work

2 Background

This section provides background information about the debugging activityand setting breakpoints In the following we use failures as unintended be-haviours of a program ie when the program does something that it shouldnot and faults as the incorrect statements in source code causing failuresThe purpose of debugging is to locate and correct faults hence to fix failures

Swarm Debugging the Collective Intelligence on Interactive Debugging 5

21 Debugging and Interactive Debugging

The IEEE Standard Glossary of Software Engineering Terminology (see thedefinition at the beginning of Section 1) defines debugging as the act of de-tecting locating and correcting bugs in a computer program Debugging tech-niques include the use of breakpoints desk checking dumps inspection re-versible execution single-step operations and traces

Araki et al [19] describe debugging as a process where developers makehypotheses about the root-cause of a problem or defect and verify these hy-potheses by examining different parts of the source code of the program

Interactive debugging consists of using a tool ie a debugger to detectlocate and correct a fault in a program It is a process also known as programanimation stepping or following execution [20] Developers often refer to thisprocess simply as debugging because several IDEs provide debuggers to sup-port debugging However it must be noted that while debugging is the processof finding faults interactive debugging is one particular debugging approachin which developers use interactive tools Expressions such as interactive de-bugging stepping and debugging are used interchangeably and there is not yeta consensus on what is the best name for this process

22 Breakpoints and Supporting Mechanisms

Generally breakpoints allow pausing intentionally the execution of a programfor debugging purposes a means of acquiring knowledge about a program dur-ing its execution for example to examine the call stack and variable valueswhen the control flow reaches the locations of the breakpoints Thus a break-point indicates the location (line) in the source code of a program where apause occurs during its execution

Depending on the programming language its run-time environment (inparticular the capabilities of its virtual machines if any) and the debuggersdifferent types of breakpoints may be available to developers These types in-clude static breakpoints [21] that pause unconditionally the execution of aprogram and dynamic breakpoints [22] that pause depending on some con-ditions or threads or numbers of hits

Other types of breakpoints include watchpoints that pause the executionwhen a variable being watched is read andndashor written IDEs offer the meansto specify the different types of breakpoints depending on the programminglanguages and their run-time environment Fig 1-A and 1-B show examples ofstatic and dynamic breakpoints in Eclipse In the rest of this paper we focuson static breakpoints because they are the most used of all types [14]

There are different mechanisms for setting a breakpoint within the code

6Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 1 Setting a static breakpoint (A) and a conditional breakpoint (B)using Eclipse IDE

ndash GUI Most IDEs or browsers offer a visual way of adding a breakpoint usu-ally by clicking at the beginning of the line on which to set the breakpointChrome1 Visual Studio2 IntelliJ 3 and Xcode4

ndash Command line Some programming languages offer debugging tools on thecommand line so an IDE is not necessary to debug the code JDB5 PDB6and GDB7

ndash Code Some programming languages allow using syntactical elements to setbreakpoints as they were lsquoannotationsrsquo in the code This approach oftenonly supports the setting of a breakpoint and it is necessary to use itin conjunction with the command line or GUI Some examples are Rubydebugger8 Firefox 9 and Chrome10

There is a set of features in a debugger that allows developers to control theflow of the execution within the breakpoints ie Call Stack features whichenable continuing or stepping

A developer can opt for continuing in which case the debugger resumesexecution until the next breakpoint is reached or the program exits Con-versely stepping allows the developer to run step by step the entire program

1httpsdevelopersgooglecomwebtoolschrome-devtoolsjavascriptadd-breakpoints

2httpsmsdnmicrosoftcomen-uslibrary5557y8b4aspx

3httpswwwjetbrainscomhelpidea20163debugger-basicshtml

4httpjeffreysambellscom20140114using-breakpoints-in-xcode

5httpdocsoraclecomjavase7docstechnotestoolswindowsjdbhtml

6httpsdocspythonorg2librarypdbhtml

7ftpftpgnuorgoldgnuManualsgdb511html nodegdb 37html

8httpsgithubcomcldwalkerdebugger

9httpsdevelopermozillaorgpt-BRdocsWebJavaScriptReferenceStatementsdebugger

10httpsdevelopersgooglecomwebtoolschrome-devtoolsjavascriptadd-breakpoints

Swarm Debugging the Collective Intelligence on Interactive Debugging 7

flow The definition of a step varies across programming languages and debug-gers but it generally includes invoking a method and executing a statementWhile Stepping a developer can navigate between steps using the followingcommands

ndash Step Over the debugger steps over a given line If the line contains afunction then the function is executed and the result returned withoutstepping through each of its lines

ndash Step Into the debugger enters the function at the current line and continuestepping from there line-by-line

ndash Step Out this action would take the debugger back to the line where thecurrent function was called

To start an interactive debugging session developers set a breakpoint Ifnot the IDE would not stop and enter its interactive mode For exampleEclipse IDE automatically opens the ldquoDebugging Perspectiverdquo when executionhits a breakpoint A developer can run a system in debugging mode withoutsetting breakpoints but she must set a breakpoint to be able to stop theexecution step in and observe variable states Briefly there is no interactivedebugging session without at least one breakpoint set in the codeFinally some debuggers allow debugging remotely for example to performhot-fixes or to test mobile applications and systems operating in remote con-figurations

23 Self-organization and Swarm Intelligence

Self-organization is a concept emerged from Social Sciences and Biology and itis defined as the set of dynamic mechanisms enabling structures to appear atthe global level of a system from interactions among its lower-level componentswithout being explicitly coded at the lower levels Swarm intelligence (SI)describes the behavior resulting from the self-organization of social agents(as insects) [23] Ant nests and the societies that they house are examples ofSI [24] Individual ants can only perform relatively simple activities yet thewhole colony can collectively accomplish sophisticated activities Ants achieveSI by exchanging information encoded as chemical signalsmdashpheromones egindicating a path to follow or an obstacle to avoid

Similarly SI could be used as a metaphor to understand or explain thedevelopment of a multiversion large and complex software systems built bysoftware teams Individual developers can usually perform activities withouthaving a global understanding of the whole system [25] In a birdrsquos eye viewsoftware development is analogous to some SI in which groups of agents in-teracting locally with one another and with their environment and follow-ing simple rules lead to the emergence of global behaviors previously un-knownimpossible to the individual agents We claim that the similarities be-tween the SI of ant nests and complex software systems are not a coincidenceCockburn [26] suggested that the best architectures requirements and designs

8Please give a shorter version with authorrunning and titlerunning prior to maketitle

emerge from self-organizing developers growing in steps and following theirchanging knowledge and the changing wishes of the user community ie atypical example of swarm intelligence

Dev1

Dev2

Dev3

DevN

VisualisationsSearching Tools

Recommendation Systems

Single Debugging Session Crowd Debugging Sessions Debugging Information

Positive feedback

Collect data Store data

Transform information

A B C

D

Fig 2 Overview of the Swarm Debugging approach

24 Information Foraging

Information Foraging Theory (IFT) is based on the optimal foraging theorydeveloped by Pirolli and Card [27] to understand how people search for infor-mation IFT is rooted in biology studies and theories of how animals hunt forfood It was extended to debugging by Lawrance et al[27]

However no previous work proposes the sharing of knowledge related todebugging activities Differently from works that use IFT on a model onepreyone predator [28] we are interested in many developers working inde-pendently in many debugging sessions and sharing information to allow SI toemerge Thus debugging becomes a foraging process in a SI environment

These conceptsmdashSI and IFTmdashhave led to the design of a crowd approachapplied to debugging activities a different collective way of doing debuggingthat collects shares retrieves information from (previous and current) debug-ging sessions to support (current and future) debugging sessions

3 The Swarm Debugging Approach

Swarm Debugging (SD) uses swarm intelligence applied to interactive debug-ging data to create knowledge for supporting software development activitiesSwarm Debugging works as follows

Swarm Debugging the Collective Intelligence on Interactive Debugging 9

First several developers perform their individual independent debuggingactivities During these activities debugging events are collected by listeners(Label A in Figure 2) for example breakpoints-toggling and stepping events(Label B in Figure 2) that are then stored in a debugging-knowledge reposi-tory (Label C in Figure 2) For accessing this repository services are definedand implemented in the SDI For example stored events are processed bydedicated algorithms (Label D in Figure 2) (1) to create (several types of)visualizations (2) to offer (distinct ways of) searching and (3) to provide rec-ommendations to assist developers during debugging Recommendations arerelated to the locations where to toggle breakpoints Storing and using theseevents allow sharing developersrsquo knowledge among developers creating a col-lective intelligence about the software systems and their debugging

We chose to instrument the Eclipse IDE a popular IDE to implementSwarm Debugging and to reach a large number of users Also we use services inthe cloud to collect the debugging events to process these events and to providevisualizations and recommendations from these events Thus we decoupleddata collection from data usage allowing other researcherstools vendors touse the collected data

During debugging developers analyze the code toggling breakpoints andstepping in and through statements While traditional dynamic analysis ap-proaches collect all interactions states or events SD collects only invocationsexplicitly explored by developers SDI collects only visited areas and paths(chains of invocations by egStep Into or F5 in Eclipse IDE) and thus doesnot suffer from performance or memory issues as omniscient debuggers [29] ortracing-based approaches could

Our decision to record information about breakpoints and stepping is well supported by a study by Beller et al. [30]. A finding of this study is that setting breakpoints and stepping through code are the most used debugging features. They showed that most of the recorded debugging events are related to the creation (45.44%), removal (43.62%), or adjustment of breakpoints, hitting them during debugging, and stepping through the source code. Furthermore, other, more advanced debugging features, like defining watches and modifying variable values, were much less used [30].

4 SDI in a Nutshell

To evaluate the Swarm Debugging approach, we implemented the Swarm Debugging Infrastructure (see https://github.com/SwarmDebugging). The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools for collecting, storing, sharing, retrieving, and visualizing data collected during developers' debugging activities. The SDI is an Eclipse IDE11 plug-in integrated with the Eclipse Debug core. It is organized in three main modules: (1) the Swarm Debugging Services, (2) the Swarm Debugging Tracer, and (3) the Swarm

11 https://www.eclipse.org


Fig. 3 GV elements: types (nodes), invocations (edges), and the task filter area

Debugging Views. All the implementation details of the SDI are available in the Appendix.

4.1 Swarm Debugging Global View

The Swarm Debugging Global View (GV) is a call-graph visualization for modeling software, based on directed call graphs [31], that makes explicit the hierarchical relationships among invoked methods. This visualization uses rounded gray boxes (Figure 3-A) to represent types or classes (nodes) and oriented arrows (Figure 3-B) to express invocations (edges). GV is built using the debugging session context data previously collected by developers for different tasks.

GV was implemented using Cytoscape.js [32], a graph API JavaScript framework, applying its automatic breadthfirst layout manager. As a web application, the SD visualisations can be integrated into an Eclipse view as an SWT Browser widget or accessed through a traditional browser such as Mozilla Firefox or Google Chrome.

In this view, the grey boxes are types that developers visited during debugging sessions. The edges represent method calls (Step Into or F5 in Eclipse) performed by all developers in all traced tasks on a software project. Each edge colour represents a task, and line thickness is proportional to the number of invocations. Each debugging session contributes its context, and the visualisation combines all collected invocations. The visualisation is organised in layers or stacks, and each line is a layer of invocations. The starting points (non-invoked methods) are allocated on top of a tree, followed by the adjacent


Fig. 4 GV on all tasks

nodes in the invocation sequence. Besides, developers can go directly to a type in the Eclipse editor by double-clicking a node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and to zoom in and out on the visualisation. Figure 4 shows an example of GV with all tasks for the JabRef system, for which we have data about 8 tasks.

GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including the type declarations and method invocations that developers explored based on their own decisions.
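As an illustration of the data behind such a view, a minimal model of a task-labelled invocation graph could look as follows; the class and method names are ours, not the SDI's:

import java.util.HashMap;
import java.util.Map;

/** Minimal model of the GV graph: types as nodes, task-labelled invocation edges. */
public class GlobalViewModel {

    /** Edge key "Caller->Callee#task"; value = number of collected invocations. */
    private final Map<String, Integer> edgeWeights = new HashMap<>();

    /** Called once per Step Into collected by the tracer. */
    public void addInvocation(String callerType, String calleeType, String task) {
        String key = callerType + "->" + calleeType + "#" + task;
        edgeWeights.merge(key, 1, Integer::sum); // edge thickness is proportional to this count
    }

    public int weight(String callerType, String calleeType, String task) {
        return edgeWeights.getOrDefault(callerType + "->" + calleeType + "#" + task, 0);
    }
}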

5 Using SDI to Understand Debugging Activities

The first benefit of the SDI is that it allows collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviors during debugging activities. To illustrate this point, we conducted two experiments using the SDI to understand developers' debugging habits: the times and effort with which they set breakpoints and the locations where they set them.

Our analysis builds upon three independent sets of observations involving, in total, three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions available from our own collected videos (Study 1) and from an empirical study performed by Jiang et al. [33] (Study 2).

In this study, we answered the following research questions:

RQ1 Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2 What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?


RQ3 Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

RQ4 Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef12 version 3.2 as subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., it has high cohesion and low coupling, thus reducing the potential commingling effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service13. Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montreal to participate in our study. We surveyed all the participants' background information before the study14. The survey included questions about the participants' self-assessed level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD and interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the debugger of Eclipse.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose fault-fixing tasks that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available at https://goo.gl/forms/dxCQaBke2l2cqjB42


Table 1 Summary of the issues considered in JabRef in Study 1

Issues   Summaries

318      "Normalize to Bibtex name format"
667      "hash/pound sign causes URL link to fail"
669      "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993      "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026     "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating step by step how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to automatically collect breakpoint-related events. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), an open-source software for live streaming and recording. We used OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants with too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 https://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that each task would require about 20 minutes per fault, which we discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and for a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
– call: method/function invocations
– return: returns of values
– assignment: settings of values
– if-statement: conditional statements
– while-loop: loop iterations

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
– Did you locate the fault?
– Where was the fault?
– Why did the fault happen?
– Were you tired?
– How was your debugging experience?

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video and started counting when the participant effectively started the task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.

– End Time (T): the time when the participant finished a task.


– Elapsed End Time (ET): ET = T − ST
– Elapsed Time of First Breakpoint (EF): EF = FB − ST
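For illustration only, ET and EF can be derived directly from the collected timestamps; the values below are invented, not study data:

import java.time.Duration;
import java.time.Instant;

/** Derives ET and EF for one debugging session (timestamps are illustrative). */
public class SessionMetrics {

    static Duration elapsedEndTime(Instant st, Instant t) {
        return Duration.between(st, t); // ET = T - ST
    }

    static Duration elapsedFirstBreakpoint(Instant st, Instant fb) {
        return Duration.between(st, fb); // EF = FB - ST
    }

    public static void main(String[] args) {
        Instant st = Instant.parse("2016-05-10T14:00:00Z"); // Start Time (ST)
        Instant fb = st.plus(Duration.ofMinutes(5));        // Time of First Breakpoint (FB)
        Instant t  = st.plus(Duration.ofMinutes(20));       // End Time (T)
        System.out.println("ET = " + elapsedEndTime(st, t).toMinutes() + " min");          // 20 min
        System.out.println("EF = " + elapsedFirstBreakpoint(st, fb).toMinutes() + " min"); // 5 min
    }
}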

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks had been solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 took different approaches. The tasks in Study 1 were fault-location tasks (developers did not correct the faults), while those in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 analysed only one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1 Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET    (1)
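For example, a session that starts at minute 0, has its first breakpoint set at minute 5, and ends at minute 20 yields MFB = 5/20 = 0.25, i.e., one quarter of the session elapsed before the first breakpoint was set.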

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Tasks    Average Times (min)   Std Devs (min)

318      44                    64
667      28                    29
669      22                    25
993      25                    25
1026     25                    17
PdfSam   54                    18
Raptor   59                    13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spent 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, it took participants on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort of setting the first breakpoint takes nearly one quarter of the total effort of a single debugging session.a This effort is thus important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

a In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2 What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47): task elapsed time is inversely correlated with the time of the task's first breakpoint, following

f(x) = α / x^β    (2)

where α = 1.2 and β = 0.44.
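To illustrate the shape of this fit (our arithmetic, for illustration only): with α = 1.2 and β = 0.44, a session whose first breakpoint is set after 10% of its duration (x = 0.1) gives f(0.1) ≈ 3.3, while one with x = 0.5 gives f(0.5) ≈ 1.6, in the units of the fit. That is, sessions in which proportionally more time is invested before the first breakpoint are associated with roughly half the elapsed time.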

Fig. 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3 Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints were set on call statements, while only 1% (3/207) were set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints were set on call statements and only 4% (4/100) on while-loop statements. The only difference concerns assignment statements, for which we found 17% in Study 1 but 27% in Study 2. After grouping if-statements, returns, and while-loops into control-flow statements, we found that 30% of breakpoints were on control-flow statements, while 53% were on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statements     Breakpoints   %

call           111           53
if-statement   39            19
assignment     36            17
return         18            10
while-loop     3             1

Table 4 Study 2 - Breakpoints per type of statement

Statements     Breakpoints   %

call           43            43
if-statement   22            22
assignment     27            27
return         4             4
while-loop     4             4


Our results show that, in both studies, about half of the breakpoints were set on call statements, while control-flow-related statements received comparatively fewer, with while-loop statements being the least common (4% or less).


RQ4 Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether different participants performing the same task, i.e., resolving the same fault, set breakpoints on the same lines of code, comparing breakpoints within the same task and across different tasks. We sorted all the breakpoints in our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line for the same task by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 of class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field);

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale in that decision, because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks   Classes              Lines of Code   Breakpoints

0318    AuthorsFormatter     43              5
0318    AuthorsFormatter     131             3
0667    BasePanel            935             2
0667    BasePanel            969             3
0667    JabRefDesktop        430             2
0669    OpenDatabaseAction   268             2
0669    OpenDatabaseAction   433             4
0669    OpenDatabaseAction   451             4
0993    EntryEditor          717             2
0993    EntryEditor          720             2
0993    EntryEditor          723             2
0993    BibDatabase          187             2
0993    BibDatabase          456             2
1026    EntryEditor          1184            2
1026    BibtexParser         160             2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                 Lines of Code   Breakpoints

PdfReader               230             2
PdfReader               806             2
PdfReader               1923            2
ConsoleServicesFacade   89              2
ConsoleClient           81              2
PdfUtility              94              2
PdfUtility              96              2
PdfUtility              102             2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes             Lines of Code   Breakpoints

icsUtils            333             3
Game                1751            2
ExamineController   41              2
ExamineController   84              3
ExamineController   87              2
ExamineController   92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints in different tasks. We grouped all breakpoints by class and counted in how many of the tasks each type received at least one breakpoint, producing Table 9. We also counted the number of breakpoints per type and how many different participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 had only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                     Lines of Code (Breakpoints)

BibtexParser                138 (2), 151 (2), 159 (2), 160 (3), 165 (2), 168 (3), 176 (2), 198 (2), 199 (2), 299 (2)
EntryEditor                 717 (3), 720 (4), 721 (2), 723 (2), 837 (3), 842 (2), 1184 (3), 1393 (2)
BibDatabase                 175 (2), 187 (3), 223 (2), 456 (6)
OpenDatabaseAction          433 (4), 450 (2), 451 (4)
JabRefDesktop               40 (2), 84 (2), 430 (3)
SaveDatabaseAction          177 (4), 188 (2)
BasePanel                   935 (2), 969 (5)
AuthorsFormatter            43 (5), 131 (4)
EntryTableTransferHandler   346 (2)
FieldTextMenu               84 (2)
JabRefFrame                 1119 (2)
JabRefMain                  8 (5)
URLUtil                     95 (2)

Fig. 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints


Table 9 Study 1 - Breakpoints by class across different tasks

Types                Tasks with Breakpoints   Breakpoints   Dev Diversities

SaveDatabaseAction   3 of 5                   7             2
BasePanel            4 of 5                   14            7
JabRefDesktop        2 of 5                   9             4
EntryEditor          3 of 5                   36            4
BibtexParser         3 of 5                   44            6
OpenDatabaseAction   3 of 5                   19            13
JabRef               3 of 5                   3             3
JabRefMain           4 of 5                   5             4
URLUtil              2 of 5                   4             2
BibDatabase          3 of 5                   19            4

and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale behind their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, working on different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5 Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6 Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study Design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser, and (2) a controlled experiment on fault-location tasks, with a Tetris program as warm-up, using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is a reference management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig. 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21: 23 male and 7 female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have, on average, 4.8 years of Java experience (std. dev. 3.3 years), and 97% use Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in the qualitative evaluation of GV, and 13 participated in the fault-location controlled experiment (7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked faults reported against JabRef v3.2 in its issue-tracking system (Issues 318, 993, 1026, 1173, 1235, and 1251). We asked participants to find the locations of the faults, asking questions such as "Where was the fault?" for Task 318 or "For Task 1173, where would you toggle a breakpoint to fix the fault?", as well as about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the participants' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all the tools required to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environments were correctly configured and that the participants understood the instructions. The warm-up task was described in a video that guided the participants; we make this video available online24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV in a regular Web browser, without access to the JabRef source code. We asked the participants to identify the "type" (class) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the locations of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available at https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 https://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group performed the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available online25, with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted of starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the locations of the faults described in the issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired and to inform us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 (OBS) for video recording.

6.2 Results

We now discuss the results of our evaluation.

RQ5 Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type containing each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found; 27/31 sessions) and "bad" sessions (the fault was not found; 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available at https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of the participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig. 8 GV for Task 0318

For Task 667 (Figure 9), 95% of the participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig. 9 GV for Task 0667


Fig. 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of the participants (22/23) could suggest a "candidate" for the type containing the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without previously seeing the code.

RQ6 Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video-recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on its node. We also asked participants whether GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig. 11 GV usefulness: experimental phase one

Fig. 12 GV usefulness: experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching code.


Table 10 Results from control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   ∆ [C−E] (s)   [E/C] (%)

First breakpoint   00:02:55      00:03:40         −44           126
Time to start      00:04:44      00:05:18         −33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026

Metric             Control [C]   Experiment [E]   ∆ [C−E] (s)   [E/C] (%)

First breakpoint   00:02:42      00:04:48         −126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time setting the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time finishing both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling their first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results for RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and to our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation, and we now discuss both its intrinsic and accidental advantages and limitations as reported by them. We return to some of these limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation, because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all the developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation, because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation, because our visualisation is built to be complementary to the other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting a system's main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that should be refactored if they also contain many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information in our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes and methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently in different parts of the graph, to avoid large nodes. We plan to modify the graph visualisation to offer a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem, because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


debugging sessions. It encourages us to pursue our research in this direction and to perform more experiments to find further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debugger developers, and educators. The SDI (and GV) is open and freely available online28, and researchers can use it to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns and to identify the debugging strategies that are most efficient in the context of their projects, improving their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and to ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entities. Thus, breakpoints previously toggled by one developer could assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new session by providing developers with valid starting points. Therefore, the breakpoint-searching tool can decrease the time spent to toggle a new breakpoint.
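As a sketch of how such retrieval could look from a client's perspective, assuming a hypothetical REST endpoint on the SDI server (the path, query parameter, and payload format are our assumptions):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Queries breakpoints shared by previous debugging sessions (illustrative only). */
public class BreakpointSearch {

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // Hypothetical endpoint and query parameter; the real SDI service may differ.
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://server.swarmdebugging.org/breakpoints?type=BibtexParser"))
                .GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        // Assumed payload: JSON records {file, line, task, developer} that an IDE
        // plug-in could rank and present as candidate starting points for a new session.
        System.out.println(response.body());
    }
}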

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools, using novel data-mining techniques, that integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed over time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered the question of what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities in general: they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory of how developers set breakpoints and step through code, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI in obtaining insights about developers' debugging activities. Future studies with a larger number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI in collecting and sharing data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the questions that developers asked in another study, by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI were exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, their levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring high costs. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data are. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data, to the subject systems, and to whether the collected data are sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify ways to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant became familiar with the system after executing a task and was knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware that all the faults had already been solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility of generalising our results. We used only one system in Study 1 (JabRef) because we needed enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the chosen system affects our results.

9 Related Work

We now summarise works related to debugging to position our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature- and fault-location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43, 44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviour of developers during debugging sessions. Second, it is useful in editing mode because it filters the files shown in an Eclipse view according to a previous context. Our approach serves both editing mode (finding breakpoints or visualizing paths) and interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system that improves debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that the empirical evidence for the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time needed to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the time actually saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6, 52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging, used by Hofer et al. [53], exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to systematically minimise a failure-inducing input. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change-impact analysis process that developers should perform during software maintenance to make sure that changes do not introduce new faults. They conducted two studies about change-impact analysis during debugging sessions. They found that the programmers in their studies performed static change-impact analysis before making changes, by using IDE navigational functionalities. They also performed dynamic change-impact analysis after making changes, by running the programs. In their study, programmers did not use any change-impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale: different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code with a higher incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without looking at the code beforehand and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses about fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211-220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, in Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13-16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, in 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126-135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492-501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the Future of Software Engineering - FOSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 100-116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Röhm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140-144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285-295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, IEEE Software 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-Oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96-106. DOI 10.1145/1218563.1218575
22. Eclipse, Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11-20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572-583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, in Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108-124 (1997). DOI 10.1145/263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1-39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271-278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K. De Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, NY, USA, 2014), pp. 53-63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14-24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 689-699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1-11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251-260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31-40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456-465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391-400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99-110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ACM Press, New York, NY, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, in Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, NY, USA, 2009), pp. 1569-1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485-495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, in Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110-119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175-. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063-3072. DOI 10.1145/2466416.2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig. 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We choose and define domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

- Developer is an SDT user. She creates and executes debugging sessions.
- Product is the target software product. A product is a set of one or more Eclipse projects.
- Task is the task to be executed by developers.
- Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig. 14 The Swarm Debugging metadata [17]

- Type represents classes and interfaces in the project. Each type has source code and a file; the SDS only considers types whose source code is available as belonging to the project domain.
- Method is a method associated with a type, which can be invoked during debugging sessions.
- Namespace is a container for types. In Java, namespaces are declared with the keyword package.
- Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
- Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
- Event is the data collected when a developer performs some action during a debugging session.
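To make the meta-model concrete, the sketch below shows how the Session concept could be declared as a JPA entity in the Spring Boot services; the field names are illustrative assumptions, not the exact SDS schema.

import java.time.Instant;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

// Minimal sketch of the Session concept from Fig. 14 as a JPA entity.
// Field names are illustrative assumptions, not the exact SDS schema.
@Entity
public class Session {

    @Id
    @GeneratedValue
    private Long id;

    private Long developerId;  // the Developer who created and ran the session
    private Long taskId;       // the Task being debugged
    private Instant startedAt; // when the session started

    // getters and setters omitted for brevity
}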

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built using the Spring Boot framework (https://projects.spring.io/spring-boot). Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
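As a usage sketch, the same search can be issued from the command line:

curl "http://swarmdebugging.org/developers/search/findByName?name=petrillo"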

SQL Query Console. The SDS provides a console (http://db.swarmdebugging.org) to execute SQL queries on the debugging data, providing relational aggregations and functions.
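For instance, a researcher could aggregate breakpoints per type directly in the console; the following is a minimal sketch, assuming table and column names that follow the meta-model of Fig. 14 (the actual schema may differ):

-- Count breakpoints per type, most-debugged types first
-- (table and column names are illustrative assumptions)
SELECT t.name AS type_name, COUNT(b.id) AS breakpoints
FROM breakpoint b
JOIN type t ON b.type_id = t.id
GROUP BY t.name
ORDER BY breakpoints DESC;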

Full-text Search Engine. The SDS also provides ElasticSearch (https://www.elastic.co), a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig. 15 Swarm Debugging Dashboard



Fig. 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J (http://neo4j.com) graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
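A query in the spirit of Figure 16 could look as follows; the node labels and the relationship type are illustrative assumptions about the SDS graph model, not its exact schema:

// Methods invoked from methods of a given type during debugging sessions
// (labels and relationship names are illustrative assumptions)
MATCH (caller:Method)-[:INVOKES]->(callee:Method)
WHERE caller.typeName = 'BasePanel'
RETURN caller.name, callee.name
LIMIT 25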

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA support, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
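A minimal sketch of how such a tracer can hook into the Eclipse Debug core is shown below; the event-handling bodies are illustrative placeholders, not the actual SDT implementation:

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

// Minimal sketch of a debug tracer registered with the Eclipse Debug core.
public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);
        plugin.getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Stepping events (Step Into/Over/Return) arrive as RESUME
            // events whose detail encodes the step kind.
            if (event.getKind() == DebugEvent.RESUME && event.isStepStart()) {
                // e.g., analyze the stack trace and send a RESTful
                // message to the SDS (omitted in this sketch)
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // e.g., capture the breakpoint's type and line, then send to the SDS
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}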

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events, such as Step Into, Step Over, or Step Return. These events are caught, and stack-trace items are analyzed by the Tracer, which extracts method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.



Fig. 17 The Swarm Tracer architecture [17]

Fig. 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig. 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] that represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


Fig. 20 Sequence stack diagram for the Bridge design pattern

Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
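Under the hood, such a search can be expressed as an ElasticSearch fuzzy query; the following is a minimal sketch that assumes an index of breakpoints with a typeName field (the actual index layout of the SDS may differ):

GET /breakpoints/_search
{
  "query": {
    "fuzzy": { "typeName": { "value": "fcatory" } }
  }
}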


Fig. 21 Method call graph for the Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V_1, V_2, ..., V_n} and E is a set of edges E = {(V_1, V_2), (V_1, V_3), ...}, each edge is formed by a pair <V_i, V_j>, where V_i is the invoking method and V_j is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked by methods, then the starting and ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
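As a sketch, the two sets can be computed from the collected invocation pairs with two set differences; the method names below are illustrative:

import java.util.*;

// Compute starting methods (invoke but are never invoked) and
// ending methods (are invoked but never invoke) from invocation pairs.
public class StartEndMethods {

    public static void main(String[] args) {
        // Each entry is an <invoking, invoked> pair, as collected by the SDT.
        List<String[]> invocations = List.of(
                new String[]{"main", "parse"},
                new String[]{"parse", "readField"},
                new String[]{"main", "render"});

        Set<String> invoking = new HashSet<>(); // alpha: methods that invoke
        Set<String> invoked = new HashSet<>();  // beta: methods that are invoked
        for (String[] edge : invocations) {
            invoking.add(edge[0]);
            invoked.add(edge[1]);
        }

        Set<String> starting = new HashSet<>(invoking);
        starting.removeAll(invoked); // alpha minus beta

        Set<String> ending = new HashSet<>(invoked);
        ending.removeAll(invoking);  // beta minus alpha

        System.out.println("Starting points: " + starting); // [main]
        System.out.println("Ending points: " + ending);     // [readField, render]
    }
}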

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


We also conducted two studies with 30 professional developers, a qualitative evaluation and a controlled experiment, to assess whether debugging sessions shared through our Global View visualisation support developers in their debugging tasks and are useful for sharing debugging tasks among developers (RQ5 and RQ6). We collected participants' answers in electronic forms and more than 3 hours of debugging sessions on video.

This paper makes the following contributions:

- We introduce a novel approach for debugging, named Swarm Debugging (SD), based on the concepts of swarm intelligence and information foraging theory.
- We present an infrastructure, the Swarm Debugging Infrastructure (SDI), to gather, store, and share data about interactive debugging activities to support SD.
- We provide evidence about the relation between tasks' elapsed times, developers' expertise, breakpoint setting, and debugging patterns.
- We present a new visualisation technique, Global View (GV), built on debugging sessions shared by developers, to ease debugging.
- We provide evidence about the usefulness of sharing debugging sessions to ease developers' debugging.

This paper extends our previous works [16,17,18] as follows. First, we summarize the main characteristics of the Swarm Debugging approach, providing a theoretical foundation for Swarm Debugging based on swarm intelligence and information foraging theory. Second, we present the Swarm Debugging Infrastructure (SDI). Third, we perform an experiment on the debugging behavior of 30 professional developers to evaluate whether sharing debugging sessions adequately supports their debugging tasks.

The remainder of this article is organized as follows. Section 2 provides some fundamentals of debugging and the foundations of SD: the concepts of swarm intelligence and information foraging theory. Section 3 describes our approach and its implementation, the Swarm Debugging Infrastructure. Section 6 presents an experiment to assess the benefits that our SD approach can bring to developers, and Section 5 reports two experiments that were conducted using the SDI to understand developers' debugging habits. Next, Section 7 discusses the implications of our results, while Section 8 presents threats to the validity of our study. Section 9 summarizes related work. Finally, Section 10 concludes the paper and outlines future work.

2 Background

This section provides background information about the debugging activity and the setting of breakpoints. In the following, we use failures to denote unintended behaviours of a program, i.e., when the program does something that it should not, and faults to denote the incorrect statements in the source code causing failures. The purpose of debugging is to locate and correct faults, hence to fix failures.


2.1 Debugging and Interactive Debugging

The IEEE Standard Glossary of Software Engineering Terminology (see the definition at the beginning of Section 1) defines debugging as the act of detecting, locating, and correcting bugs in a computer program. Debugging techniques include the use of breakpoints, desk checking, dumps, inspection, reversible execution, single-step operations, and traces.

Araki et al. [19] describe debugging as a process in which developers make hypotheses about the root cause of a problem or defect and verify these hypotheses by examining different parts of the source code of the program.

Interactive debugging consists of using a tool, i.e., a debugger, to detect, locate, and correct a fault in a program. It is a process also known as program animation, stepping, or following execution [20]. Developers often refer to this process simply as debugging, because several IDEs provide debuggers to support it. However, it must be noted that while debugging is the process of finding faults, interactive debugging is one particular debugging approach in which developers use interactive tools. Expressions such as interactive debugging, stepping, and debugging are used interchangeably, and there is not yet a consensus on the best name for this process.

2.2 Breakpoints and Supporting Mechanisms

Generally, breakpoints allow intentionally pausing the execution of a program for debugging purposes, as a means of acquiring knowledge about the program during its execution, for example, to examine the call stack and variable values when the control flow reaches the locations of the breakpoints. Thus, a breakpoint indicates the location (line) in the source code of a program where a pause occurs during its execution.

Depending on the programming language, its run-time environment (in particular the capabilities of its virtual machines, if any), and the debuggers, different types of breakpoints may be available to developers. These types include static breakpoints [21], which pause the execution of a program unconditionally, and dynamic breakpoints [22], which pause depending on some conditions, threads, or numbers of hits.

Other types of breakpoints include watchpoints, which pause the execution when a watched variable is read and/or written. IDEs offer the means to specify the different types of breakpoints, depending on the programming languages and their run-time environments. Figures 1-A and 1-B show examples of static and dynamic breakpoints in Eclipse. In the rest of this paper, we focus on static breakpoints because they are the most used of all types [14].

There are different mechanisms for setting a breakpoint within the code:


Fig. 1 Setting a static breakpoint (A) and a conditional breakpoint (B) using the Eclipse IDE

- GUI: most IDEs or browsers offer a visual way of adding a breakpoint, usually by clicking at the beginning of the line on which to set the breakpoint: Chrome(1), Visual Studio(2), IntelliJ(3), and Xcode(4).
- Command line: some programming languages offer debugging tools on the command line, so an IDE is not necessary to debug the code: JDB(5), PDB(6), and GDB(7).
- Code: some programming languages allow using syntactical elements to set breakpoints, as if they were 'annotations' in the code (see the sketch below). This approach often only supports the setting of a breakpoint, and it is necessary to use it in conjunction with the command line or the GUI. Some examples are the Ruby debugger(8), Firefox(9), and Chrome(10).
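For example, in JavaScript, the debugger statement pauses execution at the line where it appears whenever the developer tools are open, which is the mechanism behind the Firefox and Chrome examples cited above (the function itself is a made-up illustration):

function applyDiscount(price) {
  debugger; // execution pauses here when the developer tools are open
  return price * 0.9;
}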

A set of features in a debugger allows developers to control the flow of execution between breakpoints, i.e., call-stack features, which enable continuing or stepping.

A developer can opt for continuing, in which case the debugger resumes execution until the next breakpoint is reached or the program exits. Conversely, stepping allows the developer to run the entire program flow step by step.

1 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints
2 https://msdn.microsoft.com/en-us/library/5557y8b4.aspx
3 https://www.jetbrains.com/help/idea/2016.3/debugger-basics.html
4 http://jeffreysambells.com/2014/01/14/using-breakpoints-in-xcode
5 http://docs.oracle.com/javase/7/docs/technotes/tools/windows/jdb.html
6 https://docs.python.org/2/library/pdb.html
7 ftp://ftp.gnu.org/old-gnu/Manuals/gdb-5.1.1/html_node/gdb_37.html
8 https://github.com/cldwalker/debugger
9 https://developer.mozilla.org/pt-BR/docs/Web/JavaScript/Reference/Statements/debugger
10 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints


The definition of a step varies across programming languages and debuggers, but it generally includes invoking a method and executing a statement. While stepping, a developer can navigate between steps using the following commands:

- Step Over: the debugger steps over a given line. If the line contains a function, the function is executed, and the result is returned without stepping through each of its lines.
- Step Into: the debugger enters the function at the current line and continues stepping from there, line by line.
- Step Out: this action takes the debugger back to the line where the current function was called.

To start an interactive debugging session, developers set a breakpoint; otherwise, the IDE would not stop and enter its interactive mode. For example, the Eclipse IDE automatically opens the "Debugging Perspective" when execution hits a breakpoint. A developer can run a system in debugging mode without setting breakpoints, but she must set a breakpoint to be able to stop the execution, step in, and observe variable states. Briefly, there is no interactive debugging session without at least one breakpoint set in the code. Finally, some debuggers allow remote debugging, for example, to perform hot-fixes or to test mobile applications and systems operating in remote configurations.

2.3 Self-organization and Swarm Intelligence

Self-organization is a concept that emerged from the social sciences and biology, and it is defined as the set of dynamic mechanisms enabling structures to appear at the global level of a system from interactions among its lower-level components, without being explicitly coded at the lower levels. Swarm intelligence (SI) describes the behavior resulting from the self-organization of social agents (such as insects) [23]. Ant nests and the societies that they house are examples of SI [24]. Individual ants can only perform relatively simple activities, yet the whole colony can collectively accomplish sophisticated activities. Ants achieve SI by exchanging information encoded as chemical signals, pheromones, e.g., indicating a path to follow or an obstacle to avoid.

Similarly, SI can be used as a metaphor to understand or explain the development of multiversion, large, and complex software systems built by software teams. Individual developers can usually perform activities without having a global understanding of the whole system [25]. In a bird's-eye view, software development is analogous to some SI, in which groups of agents, interacting locally with one another and with their environment and following simple rules, lead to the emergence of global behaviors previously unknown or impossible to the individual agents. We claim that the similarities between the SI of ant nests and complex software systems are not a coincidence. Cockburn [26] suggested that the best architectures, requirements, and designs emerge from self-organizing developers, growing in steps and following their changing knowledge and the changing wishes of the user community, i.e., a typical example of swarm intelligence.

Fig. 2 Overview of the Swarm Debugging approach

2.4 Information Foraging

Information Foraging Theory (IFT) is based on the optimal foraging theory and was developed by Pirolli and Card [27] to understand how people search for information. IFT is rooted in biology studies and theories of how animals hunt for food. It was extended to debugging by Lawrance et al. [27].

However, no previous work proposes the sharing of knowledge related to debugging activities. Differently from works that use IFT with a one-prey/one-predator model [28], we are interested in many developers working independently in many debugging sessions and sharing information, allowing SI to emerge. Thus, debugging becomes a foraging process in an SI environment.

These concepts, SI and IFT, have led to the design of a crowd approach applied to debugging activities: a different, collective way of doing debugging that collects, shares, and retrieves information from (previous and current) debugging sessions to support (current and future) debugging sessions.

3 The Swarm Debugging Approach

Swarm Debugging (SD) uses swarm intelligence applied to interactive debugging data to create knowledge that supports software development activities. Swarm Debugging works as follows.


First, several developers perform their individual, independent debugging activities. During these activities, debugging events are collected by listeners (Label A in Figure 2), for example, breakpoint-toggling and stepping events (Label B in Figure 2), which are then stored in a debugging-knowledge repository (Label C in Figure 2). For accessing this repository, services are defined and implemented in the SDI. For example, stored events are processed by dedicated algorithms (Label D in Figure 2) (1) to create (several types of) visualizations, (2) to offer (distinct ways of) searching, and (3) to provide recommendations to assist developers during debugging. Recommendations are related to the locations where to toggle breakpoints. Storing and using these events allows sharing knowledge among developers, creating a collective intelligence about the software systems and their debugging.

We chose to instrument the Eclipse IDE, a popular IDE, to implement Swarm Debugging and to reach a large number of users. Also, we use services in the cloud to collect the debugging events, to process these events, and to provide visualizations and recommendations from these events. Thus, we decoupled data collection from data usage, allowing other researchers and tool vendors to use the collected data.

During debugging, developers analyze the code, toggling breakpoints and stepping in and through statements. While traditional dynamic-analysis approaches collect all interactions, states, or events, SD collects only invocations explicitly explored by developers. The SDI collects only visited areas and paths (chains of invocations triggered by, e.g., Step Into or F5 in the Eclipse IDE) and thus does not suffer from the performance or memory issues that omniscient debuggers [29] or tracing-based approaches could.

Our decision to record information about breakpoints and stepping is well supported by a study by Beller et al. [30]. A finding of this study is that setting breakpoints and stepping through code are the most used debugging features. They showed that most of the recorded debugging events are related to the creation (45.44%), removal (43.62%), or adjustment of breakpoints, hitting them during debugging, and stepping through the source code. Furthermore, other advanced debugging features, like defining watches and modifying variable values, have been much less used [30].

4 SDI in a Nutshell

To evaluate the Swarm Debugging approach, we have implemented the Swarm Debugging Infrastructure (see https://github.com/SwarmDebugging). The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools for collecting, storing, sharing, retrieving, and visualizing data collected during developers' debugging activities. The SDI is a plug-in for the Eclipse IDE (https://www.eclipse.org), integrated with the Eclipse Debug core. It is organized in three main modules: (1) the Swarm Debugging Services, (2) the Swarm Debugging Tracer, and (3) the Swarm Debugging Views. All the implementation details of the SDI are available in the Appendix.

Fig. 3 GV elements - types (nodes), invocations (edges), and task filter area

4.1 Swarm Debugging Global View

The Swarm Debugging Global View (GV) is a call graph for modeling software, based on directed call graphs [31], that makes explicit the hierarchical relationships among invoked methods. This visualization uses rounded gray boxes (Figure 3-A) to represent types or classes (nodes) and oriented arrows (Figure 3-B) to express invocations (edges). The GV is built using debugging-session context data previously collected by developers for different tasks.

The GV was implemented using Cytoscape.js [32], a graph API JavaScript framework, applying the automatic layout manager breadthfirst. As a web application, the SD visualisations can be integrated into an Eclipse view as an SWT Browser widget or accessed through a traditional browser, such as Mozilla Firefox or Google Chrome.

In this view, the grey boxes are types that developers visited during debugging sessions. The edges represent method calls (Step Into or F5 in Eclipse) performed by all developers in all traced tasks on a software project. Each edge colour represents a task, and line thickness is proportional to the number of invocations. Each debugging session contributes a context, and the visualisation is generated by combining all collected invocations. The visualisation is organised in layers, or stacks, and each line is a layer of invocations. The starting points (non-invoked methods) are allocated on top of a tree, adjacent to the nodes in an invocation sequence. Besides, developers can directly go to a type in the Eclipse Editor by double-clicking on a node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and zoom in/out on the visualisation. Figure 4 shows an example of the GV with all tasks for the JabRef system, for which we have data about 8 tasks.

Fig. 4 GV on all tasks

The GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including type declarations and method invocations explored by developers based on their decisions.

5 Using SDI to Understand Debugging Activities

The first benefit of the SDI is that it allows collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviors during debugging activities. To illustrate this point, we conducted two experiments using the SDI to understand developers' debugging habits: the times at which and the effort with which they set breakpoints, and the locations where they set breakpoints.

Our analysis builds upon three independent sets of observations involving, in total, three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions available from our own collected videos (Study 1) and from an empirical study performed by Jiang et al. [33] (Study 2).

In this study, we answered the following research questions:

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2: What is the effort, in time, of setting the first breakpoint in relation to the debugging task's elapsed time?


RQ3: Are there consistent, common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent, common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef (http://www.jabref.org) version 3.2 as the subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., it has high cohesion and low coupling, thus reducing the potential commingling effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service (https://www.freelancer.com). Two participants were experts and three were intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montréal to participate in our study. We surveyed all the participants' background information before the study (survey available at https://goo.gl/forms/dxCQaBke2l2cqjB42). The survey included questions about participants' self-assessment of their level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD and interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the debugger of Eclipse.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose fault-fixing tasks that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.



Table 1 Summary of the issues considered in JabRef in Study 1

Issue  Summary
318    "Normalize to Bibtex name format"
667    "hash/pound sign causes URL link to fail"
669    "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993    "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026   "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial (http://swarmdebugging.org/publication) explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video (https://youtu.be/U1sBMpfL2jc) to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating, step by step, how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17], which automatically collects breakpoint-related events. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software (OBS, https://obsproject.com), open-source software for live streaming and recording. We used OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.



All the participants in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that each task would require about 20 minutes per fault, which we discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and for a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

- 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.
- The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
  - call: method/function invocations;
  - return: returns of values;
  - assignment: settings of values;
  - if-statement: conditional statements;
  - while-loop: loop iterations.
- Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  - Did you locate the fault?
  - Where was the fault?
  - Why did the fault happen?
  - Were you tired?
  - How was your debugging experience?

Based on these data, we obtained or computed the following metrics per participant and task:

- Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started counting when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.
- Time of First Breakpoint (FB): the time when the participant set her first breakpoint.
- End Time (T): the time when the participant finished a task.
- Elapsed End Time (ET): ET = T - ST.
- Elapsed Time of First Breakpoint (EF): EF = FB - ST.

We manually verified whether participants were successful at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all the tasks had been solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge (PDFSaM, http://www.pdfsam.org) and Raptor (https://code.google.com/p/raptor-chess-interface). They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault-location tasks, in which developers did not correct the faults, while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.



RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET    (1)

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Task    Average Time (min)  Std Dev (min)
318     44                  64
667     28                  29
669     22                  25
993     25                  25
1026    25                  17
PdfSam  54                  18
Raptor  59                  13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spent 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, it took participants on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort of setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session(a). So, this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

(a) In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer may need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort, in time, of setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = -0.47), showing that task elapsed time is inversely correlated with the time of the task's first breakpoint:

f(x) = α / x^β    (2)

where α = 12 and β = 0.44.
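As a worked reading of this fit (assuming x is the normalized time of the first breakpoint, MFB, and f(x) the task's elapsed time in minutes): a session whose first breakpoint is set after 25% of the task duration (x = 0.25) corresponds to f(0.25) = 12 / 0.25^0.44 ≈ 22 minutes, while a proportionally much earlier first breakpoint (x = 0.05) corresponds to ≈ 45 minutes.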

Fig. 5 Relation between the time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent, common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints were set on call statements, while only 1% (3/207) were set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints were set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where Study 1 showed 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints were on control-flow statements, while 53% were on call statements and 17% on assignments.

Table 3 Study 1 – Breakpoints per type of statement

Statement      Breakpoints   %
call           111           53%
if-statement   39            19%
assignment     36            17%
return         18            10%
while-loop     3             1%

Table 4 Study 2 – Breakpoints per type of statement

Statement      Breakpoints   %
call           43            43%
if-statement   22            22%
assignment     27            27%
return         4             4%
while-loop     4             4%


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements received comparatively fewer, with while-loop statements being the least common (1%–4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether different participants performing the same task, i.e., resolving the same fault, set breakpoints on the same lines of code, comparing the breakpoints within the same task and across different tasks. We sorted all the breakpoints in our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PdfSam and six in Raptor. For example, in Study 1, on line 969 of class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Röhm [15]. We suggest that there is an underlying rationale behind that decision, because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 – Breakpoints in the same line of code (JabRef) by task

Task   Class                Line of Code   Breakpoints
0318   AuthorsFormatter     43             5
0318   AuthorsFormatter     131            3
0667   BasePanel            935            2
0667   BasePanel            969            3
0667   JabRefDesktop        430            2
0669   OpenDatabaseAction   268            2
0669   OpenDatabaseAction   433            4
0669   OpenDatabaseAction   451            4
0993   EntryEditor          717            2
0993   EntryEditor          720            2
0993   EntryEditor          723            2
0993   BibDatabase          187            2
0993   BibDatabase          456            2
1026   EntryEditor          1184           2
1026   BibtexParser         160            2


Table 6 Study 2 – Breakpoints in the same line of code (PdfSam)

Class                   Line of Code   Breakpoints
PdfReader               230            2
PdfReader               806            2
PdfReader               1923           2
ConsoleServicesFacade   89             2
ConsoleClient           81             2
PdfUtility              94             2
PdfUtility              96             2
PdfUtility              102            2

Table 7 Study 2 – Breakpoints in the same line of code (Raptor)

Class               Line of Code   Breakpoints
icsUtils            333            3
Game                1751           2
ExamineController   41             2
ExamineController   84             3
ExamineController   87             2
ExamineController   92             2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted the tasks in which each class received breakpoints, producing Table 9. We also counted the number of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of the breakpoints. For example, class BibtexParser received 21% (44/207) of the breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1, because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8 Study 1 – Breakpoints in the same line of code (JabRef) in all tasks

Class                       Lines of Code        Breakpoints
BibtexParser                138, 151, 159        2, 2, 2
                            160, 165, 168        3, 2, 3
                            176, 198, 199, 299   2, 2, 2, 2
EntryEditor                 717, 720, 721        3, 4, 2
                            723, 837, 842        2, 3, 2
                            1184, 1393           3, 2
BibDatabase                 175, 187, 223, 456   2, 3, 2, 6
OpenDatabaseAction          433, 450, 451        4, 2, 4
JabRefDesktop               40, 84, 430          2, 2, 3
SaveDatabaseAction          177, 188             4, 2
BasePanel                   935, 969             2, 5
AuthorsFormatter            43, 131              5, 4
EntryTableTransferHandler   346                  2
FieldTextMenu               84                   2
JabRefFrame                 1119                 2
JabRefMain                  8                    5
URLUtil                     95                   2

Fig. 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating whether there were "preferred" methods for setting breakpoints, independently of task or participant. We find that 37 methods received at least two breakpoints and that 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.


Table 9 Study 1 – Breakpoints by class across different tasks

Class                Tasks with Breakpoints (of 5)   Breakpoints   Developer Diversity
SaveDatabaseAction   3                               7             2
BasePanel            4                               14            7
JabRefDesktop        2                               9             4
EntryEditor          3                               36            4
BibtexParser         3                               44            6
OpenDatabaseAction   3                               19            13
JabRef               3                               3             3
JabRefMain           4                               5             4
URLUtil              2                               4             2
BibDatabase          3                               19            4


Our results suggest that developers do not choose breakpoints lightly and that there is a rationale behind their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study Design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks using GV integrated into Eclipse, preceded by a warm-up on a small Tetris program. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef^20 as the subject system. JabRef is reference-management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig. 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers,^21 23 male and seven female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have, on average, 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced Java developers or experts.

Among these professionals, 23 participated in the qualitative evaluation of GV, and 13 participated in the fault-location controlled experiment (7 in the control group and 6 in the experimental group) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", as well as questions about the positive and negative aspects of GV.^22

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial^23 that explained how to install and configure all the tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants; we make this video available online.^24 The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV in a regular Web browser, without access to the JabRef source code. We asked the participants to identify the "type" (class) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the locations of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available at https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 https://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available online^25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted of starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the locations of the faults described in the issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to move on to the next task if they felt tired and to inform us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not and (for the experimental group only) comments on the usefulness of GV during each task.

All services were available on our server^26 during the debugging sessions, and the experimental data were collected within three days. We also captured video of the participants, obtaining more than 3 hours of debugging footage. The experiment tutorial contained instructions to install and set up the Open Broadcaster Software^27 video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type containing each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available at https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of the participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig. 8 GV for Task 0318

For Task 667 (Figure 9), 95% of the participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig. 9 GV for Task 0667


Fig. 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of the participants (22/23) could suggest a "candidate" type for the problematic code just by looking at the GV. However, none of them (0%, 0/23) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining the stepping paths of several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code beforehand.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video-recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants whether GV is useful for supporting software-maintenance tasks. In our qualitative study, 87% of the participants agreed that GV is useful or very useful (100% at least useful) (Figure 11), and in the task survey after the fault-location tasks, 75% of the participants claimed that GV is useful or very useful (100% at least useful) (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig. 11 GV usefulness – experimental phase one

Fig. 12 GV usefulness – experimental phase two

The analysis of our results suggests that GV is useful for supporting software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   ∆ [C−E] (s)   % [E/C]
First breakpoint   00:02:55      00:03:40         −44           126%
Time to start      00:04:44      00:05:18         −33           112%
Elapsed time       00:30:08      00:16:05         843           53%

Task 1026
Metric             Control [C]   Experiment [E]   ∆ [C−E] (s)   % [E/C]
First breakpoint   00:02:42      00:04:48         −126          177%
Time to start      00:04:02      00:03:43         19            92%
Elapsed time       00:24:58      00:20:41         257           83%

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time setting the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) compared to the control group. However, participants who used our approach spent less time finishing both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation, and we now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We come back to some of these limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information in the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system, but that as soon as a developer performs some changes, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation, because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation, because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation, because our visualisation is built to be complementary to the other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information in our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes and methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently, in different parts of the graph, to avoid large nodes. We plan to modify the graph visualisation to offer a "method-level" view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily it didn't require too much effort to spot the problem, because all the related code was concentrated inside the parser class. Luckily I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct: developers may benefit from one another's


debugging sessions. It encourages us to pursue our research in this direction and to perform more experiments to find further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, debugger developers, and educators. SDI (and GV) is open and freely available online,^28 and researchers can use it to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns and to identify the debugging strategies that are most efficient in the context of their projects, to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes reopened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, breakpoints previously toggled by one developer could help another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new session by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent toggling a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – that integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered the question of what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on how developers set breakpoints and step through code, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI in obtaining insights about developers' debugging activities. Future studies with a more significant number of participants, and more systems and tasks, are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI in collecting and sharing data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study, by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc., through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables, in particular the assumptions of the statistical tests performed on the data and how diverse the data are. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data are sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef, it allowed participants to share paths during debugging and allowed researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and may have been knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the first and last tasks of the same participant. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults had already been fixed on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility of generalising our results. We used only one system in our Study 1 (JabRef) because we needed enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the chosen system can affect our results.

9 Related Work

We now summarise work related to debugging to better position our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program-comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature- and fault-location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history, and defect-report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy-and-paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviour of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time needed to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the time actually saved by using automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine which parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging, as presented by Hofer et al. [53], can be used to minimise a failure-inducing input systematically, exploiting the observation that the smaller the failure-inducing input, the less program code is covered. Ressia et al. [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change-impact analysis that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change-impact analysis during debugging sessions. They found that the programmers in their studies performed static change-impact analysis before making changes, using IDE navigational functionalities, and performed dynamic change-impact analysis after making changes, by running the programs. In their study, programmers did not use any change-impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human debugging effort.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping into/over/out of statements.

To evaluate the usefulness of Swarm Debugging and of sharing debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos covering 45 debugging sessions, performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first


breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code with a higher incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining the stepping paths of several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without looking at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching code.

Our results provide evidence that previous debugging sessions provide insights to developers and can be starting points when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental processes involved in debugging. However, research is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging


behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is a first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB – The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD – Data Display Debugger (2010)
6. A. Ko, in Proceedings of the 28th international conference on Software engineering – ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 – Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering – FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616


15. R. Tiarks, T. Röhm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications – OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)


38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension – ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering – FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, in Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 – Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings – International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings – IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig. 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56] and include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig. 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
– Event is event data collected when a developer performs some action during a debugging session.
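As an illustration, the Breakpoint concept could be represented by a Java entity like the following minimal sketch; the field names are assumptions derived from the meta-model in Figure 14, not the actual SDS schema:

// Illustrative sketch of the Breakpoint concept from Figure 14.
// Field names are assumptions, not the actual SDS schema.
public class Breakpoint {
    private Long id;
    private Session session;   // debugging session in which it was toggled
    private Type type;         // class or interface containing the breakpoint
    private Method method;     // enclosing method, if appropriate
    private int lineNumber;    // source line on which it was set

    // getters and setters omitted for brevity
}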

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built using the Spring Boot framework.^29 Create, retrieve, update,

29 http://projects.spring.io/spring-boot


and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with the list of developers whose name is "petrillo", in JSON format.
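For illustration, the response could resemble the following JSON sketch; the exact payload structure and field names are assumptions (the endpoint above follows the Spring Data REST conventions):

{
  "_embedded": {
    "developers": [
      { "id": 1, "name": "petrillo" }
    ]
  }
}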

SQL Query Console. The SDS provides a console^30 that accepts SQL queries on the debugging data, providing relational aggregations and functions.
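For example, a query such as the following sketch could rank the types that received the most breakpoints; the table and column names are assumptions based on the meta-model of Figure 14, not the actual SDS schema:

-- Rank types by the number of breakpoints toggled on them (illustrative schema).
SELECT t.name      AS type_name,
       COUNT(b.id) AS breakpoints
FROM   breakpoint b
JOIN   type t ON b.type_id = t.id
GROUP  BY t.name
ORDER  BY breakpoints DESC;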

Full-text Search Engine. The SDS also provides ElasticSearch,^31 a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.
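As an illustration, a full-text query for debugging data mentioning a parser could be expressed in the standard ElasticSearch query DSL as follows; the index and field names are assumptions:

GET /breakpoints/_search
{
  "query": {
    "match": { "sourceCode": "parser" }
  }
}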

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
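For illustration, a Cypher query in this style could rank the methods most often reached during the sessions of a given task; the node labels and relationship types below are assumptions modeled on the meta-model described above:

// Hypothetical schema: (:Session)-[:CONTAINS]->(:Invocation)-[:INVOKES]->(:Method)
MATCH (s:Session {task: "issue-318"})-[:CONTAINS]->(i:Invocation)-[:INVOKES]->(m:Method)
RETURN m.name AS method, count(i) AS hits
ORDER BY hits DESC
LIMIT 10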

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
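A minimal sketch of how such a tracer can hook into the Eclipse debug core is shown below; the registration call is the standard Eclipse API, while the class body is illustrative:

import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IDebugEventSetListener;

// Listens to debugger events (suspend, resume, step, ...) raised by
// the Eclipse debug core during a debugging session.
public class DebugTracer implements IDebugEventSetListener {

    public void start() {
        DebugPlugin.getDefault().addDebugEventListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.STEP_END) {
                // A step (e.g., Step Into) just finished: inspect the stack
                // trace here and extract the method invocation to send to
                // the Swarm Debugging Services.
            }
        }
    }
}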

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events, such as Step Into, Step Over, or Step Return. These events are caught, and stack-trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing


Fig 19 Breakpoint search tool (fuzzy search example)

an invocation entry for each invoking/invoked method pair. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
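Under the hood, such a fuzzy search can be expressed with ElasticSearch's query DSL; in the sketch below, the index and field names are assumptions:

POST /breakpoints/_search
{
  "query": {
    "fuzzy": {
      "typeName": {
        "value": "fcatory",
        "fuzziness": "AUTO"
      }
    }
  }
}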


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V = {V_1, V_2, ..., V_n} is a set of vertices and E = {(V_1, V_2), (V_1, V_3), ...} is a set of edges, each edge is a pair <V_i, V_j>, where V_i is the invoking method and V_j is the invoked method. If α is the subset of all vertices that invoke methods and β is the subset of all vertices invoked by methods, then the starting and ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
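These definitions translate directly into set operations over the collected invocation edges; the following Java sketch (the edge representation is an assumption) computes both sets:

import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartEndFinder {

    // Each edge is a {caller, callee} pair: caller invokes callee.
    public static Set<String> startingPoints(List<String[]> edges) {
        Set<String> invoking = new HashSet<>(); // alpha: invoking methods
        Set<String> invoked = new HashSet<>();  // beta: invoked methods
        for (String[] edge : edges) {
            invoking.add(edge[0]);
            invoked.add(edge[1]);
        }
        invoking.removeAll(invoked);            // alpha minus beta
        return invoking;
    }

    public static Set<String> endingPoints(List<String[]> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (String[] edge : edges) {
            invoking.add(edge[0]);
            invoked.add(edge[1]);
        }
        invoked.removeAll(invoking);            // beta minus alpha
        return invoked;
    }

    public static void main(String[] args) {
        List<String[]> edges = Arrays.asList(
            new String[] {"main", "Circle.draw"},
            new String[] {"Circle.draw", "DrawingAPI.drawCircle"});
        System.out.println(startingPoints(edges)); // [main]
        System.out.println(endingPoints(edges));   // [DrawingAPI.drawCircle]
    }
}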

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developer explorations. Moreover, software exploration is divided by sessions, and the resulting call graphs are easy to understand because only intentionally visited areas are shown on these


graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


2.1 Debugging and Interactive Debugging

The IEEE Standard Glossary of Software Engineering Terminology (see the definition at the beginning of Section 1) defines debugging as the act of detecting, locating, and correcting bugs in a computer program. Debugging techniques include the use of breakpoints, desk checking, dumps, inspection, reversible execution, single-step operations, and traces.

Araki et al. [19] describe debugging as a process where developers make hypotheses about the root cause of a problem or defect and verify these hypotheses by examining different parts of the source code of the program.

Interactive debugging consists of using a tool, i.e., a debugger, to detect, locate, and correct a fault in a program. It is a process also known as program animation, stepping, or following execution [20]. Developers often refer to this process simply as debugging, because several IDEs provide debuggers to support debugging. However, it must be noted that, while debugging is the process of finding faults, interactive debugging is one particular debugging approach in which developers use interactive tools. Expressions such as interactive debugging, stepping, and debugging are used interchangeably, and there is not yet a consensus on the best name for this process.

2.2 Breakpoints and Supporting Mechanisms

Generally, breakpoints allow intentionally pausing the execution of a program for debugging purposes: a means of acquiring knowledge about a program during its execution, for example, to examine the call stack and variable values when the control flow reaches the locations of the breakpoints. Thus, a breakpoint indicates the location (line) in the source code of a program where a pause occurs during its execution.

Depending on the programming language, its run-time environment (in particular, the capabilities of its virtual machines, if any), and the debuggers, different types of breakpoints may be available to developers. These types include static breakpoints [21], which pause the execution of a program unconditionally, and dynamic breakpoints [22], which pause depending on some conditions, threads, or numbers of hits.

Other types of breakpoints include watchpoints, which pause the execution when a variable being watched is read and/or written. IDEs offer the means to specify the different types of breakpoints, depending on the programming languages and their run-time environments. Fig. 1-A and 1-B show examples of static and dynamic breakpoints in Eclipse. In the rest of this paper, we focus on static breakpoints because they are the most used of all types [14].

There are different mechanisms for setting a breakpoint within the code:


Fig 1 Setting a static breakpoint (A) and a conditional breakpoint (B) using the Eclipse IDE

– GUI: Most IDEs or browsers offer a visual way of adding a breakpoint, usually by clicking at the beginning of the line on which to set the breakpoint: Chrome,1 Visual Studio,2 IntelliJ,3 and Xcode.4

– Command line: Some programming languages offer debugging tools on the command line, so an IDE is not necessary to debug the code: JDB,5 PDB,6 and GDB.7

– Code: Some programming languages allow using syntactical elements to set breakpoints, as if they were 'annotations' in the code. This approach often only supports the setting of a breakpoint, and it is necessary to use it in conjunction with the command line or GUI. Some examples are Ruby debugger,8 Firefox,9 and Chrome.10

A set of features in a debugger allows developers to control the flow of the execution between breakpoints, i.e., Call Stack features, which enable continuing or stepping.

A developer can opt for continuing, in which case the debugger resumes execution until the next breakpoint is reached or the program exits. Conversely, stepping allows the developer to run the entire program

1 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints
2 https://msdn.microsoft.com/en-us/library/5557y8b4.aspx
3 https://www.jetbrains.com/help/idea/2016.3/debugger-basics.html
4 http://jeffreysambells.com/2014/01/14/using-breakpoints-in-xcode
5 http://docs.oracle.com/javase/7/docs/technotes/tools/windows/jdb.html
6 https://docs.python.org/2/library/pdb.html
7 ftp://ftp.gnu.org/old-gnu/Manuals/gdb-5.1.1/html_node/gdb_37.html
8 https://github.com/cldwalker/debugger
9 https://developer.mozilla.org/pt-BR/docs/Web/JavaScript/Reference/Statements/debugger
10 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints


flow step by step. The definition of a step varies across programming languages and debuggers, but it generally includes invoking a method and executing a statement. While stepping, a developer can navigate between steps using the following commands:

– Step Over: the debugger steps over a given line. If the line contains a function, the function is executed and the result returned without stepping through each of its lines.

– Step Into: the debugger enters the function at the current line and continues stepping from there, line by line.

– Step Out: this action takes the debugger back to the line where the current function was called.

To start an interactive debugging session, developers set a breakpoint; otherwise, the IDE would not stop and enter its interactive mode. For example, the Eclipse IDE automatically opens the "Debugging Perspective" when execution hits a breakpoint. A developer can run a system in debugging mode without setting breakpoints, but she must set a breakpoint to be able to stop the execution, step in, and observe variable states. Briefly, there is no interactive debugging session without at least one breakpoint set in the code. Finally, some debuggers allow debugging remotely, for example, to perform hot-fixes or to test mobile applications and systems operating in remote configurations.

2.3 Self-organization and Swarm Intelligence

Self-organization is a concept that emerged from the social sciences and biology, and it is defined as the set of dynamic mechanisms enabling structures to appear at the global level of a system from interactions among its lower-level components, without being explicitly coded at the lower levels. Swarm intelligence (SI) describes the behavior resulting from the self-organization of social agents (such as insects) [23]. Ant nests and the societies that they house are examples of SI [24]. Individual ants can only perform relatively simple activities, yet the whole colony can collectively accomplish sophisticated activities. Ants achieve SI by exchanging information encoded as chemical signals, i.e., pheromones, e.g., indicating a path to follow or an obstacle to avoid.

Similarly, SI could be used as a metaphor to understand or explain the development of multiversion, large, and complex software systems built by software teams. Individual developers can usually perform activities without having a global understanding of the whole system [25]. In a bird's-eye view, software development is analogous to some SI, in which groups of agents, interacting locally with one another and with their environment and following simple rules, lead to the emergence of global behaviors previously unknown/impossible to the individual agents. We claim that the similarities between the SI of ant nests and complex software systems are not a coincidence: Cockburn [26] suggested that the best architectures, requirements, and designs


emerge from self-organizing developers, growing in steps and following their changing knowledge and the changing wishes of the user community, i.e., a typical example of swarm intelligence.


Fig 2 Overview of the Swarm Debugging approach

2.4 Information Foraging

Information Foraging Theory (IFT) is based on the optimal foraging theory, developed by Pirolli and Card [27] to understand how people search for information. IFT is rooted in biology studies and theories of how animals hunt for food. It was extended to debugging by Lawrance et al. [27].

However, no previous work proposes the sharing of knowledge related to debugging activities. Differently from works that use IFT with a one-prey/one-predator model [28], we are interested in many developers working independently in many debugging sessions and sharing information to allow SI to emerge. Thus, debugging becomes a foraging process in a SI environment.

These concepts, SI and IFT, have led to the design of a crowd approach applied to debugging activities: a different, collective way of doing debugging that collects, shares, and retrieves information from (previous and current) debugging sessions to support (current and future) debugging sessions.

3 The Swarm Debugging Approach

Swarm Debugging (SD) uses swarm intelligence applied to interactive debugging data to create knowledge for supporting software development activities. Swarm Debugging works as follows.


First, several developers perform their individual, independent debugging activities. During these activities, debugging events are collected by listeners (Label A in Figure 2), for example, breakpoint-toggling and stepping events (Label B in Figure 2), which are then stored in a debugging-knowledge repository (Label C in Figure 2). For accessing this repository, services are defined and implemented in the SDI. For example, stored events are processed by dedicated algorithms (Label D in Figure 2) (1) to create (several types of) visualizations, (2) to offer (distinct ways of) searching, and (3) to provide recommendations to assist developers during debugging. Recommendations are related to the locations where to toggle breakpoints. Storing and using these events allows sharing developers' knowledge among developers, creating a collective intelligence about the software systems and their debugging.

We chose to instrument the Eclipse IDE, a popular IDE, to implement Swarm Debugging and to reach a large number of users. Also, we use services in the cloud to collect the debugging events, to process these events, and to provide visualizations and recommendations from these events. Thus, we decoupled data collection from data usage, allowing other researchers/tool vendors to use the collected data.

During debugging, developers analyze the code, toggling breakpoints and stepping in and through statements. While traditional dynamic analysis approaches collect all interactions, states, or events, SD collects only invocations explicitly explored by developers. The SDI collects only visited areas and paths (chains of invocations by, e.g., Step Into or F5 in the Eclipse IDE) and thus does not suffer from performance or memory issues, as omniscient debuggers [29] or tracing-based approaches could.

Our decision to record information about breakpoints and stepping is well supported by a study from Beller et al. [30]. A finding of this study is that setting breakpoints and stepping through code are the most used debugging features. They showed that most of the recorded debugging events are related to the creation (45.44%), removal (43.62%), or adjustment of breakpoints, hitting them during debugging, and stepping through the source code. Furthermore, other advanced debugging features, like defining watches and modifying variable values, have been much less used [30].

4 SDI in a Nutshell

To evaluate the Swarm Debugging approach, we have implemented the Swarm Debugging Infrastructure (see https://github.com/SwarmDebugging). The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools for collecting, storing, sharing, retrieving, and visualizing data collected during developers' debugging activities. The SDI is an Eclipse IDE11 plug-in integrated with the Eclipse Debug core. It is organized in three main modules: (1) the Swarm Debugging Services, (2) the Swarm Debugging Tracer, and (3) the Swarm

11 https://www.eclipse.org


Fig 3 GV elements - types (nodes), invocations (edges), and the task filter area

Debugging Views. All the implementation details of the SDI are available in the Appendix.

4.1 Swarm Debugging Global View

The Swarm Debugging Global View (GV) is a call graph for modeling software, based on directed call graphs [31], that makes explicit the hierarchical relationships created by invoked methods. This visualization uses rounded gray boxes (Figure 3-A) to represent types or classes (nodes) and oriented arrows (Figure 3-B) to express invocations (edges). The GV is built using previous debugging session context data collected by developers for different tasks.

The GV was implemented using Cytoscape.js [32], a graph API JavaScript framework, applying an automatic breadth-first layout manager. As a web application, the SD visualisations can be integrated into an Eclipse view as an SWT Browser Widget or accessed through a traditional browser, such as Mozilla Firefox or Google Chrome.

In this view, the grey boxes are types that developers visited during debugging sessions. The edges represent method calls (Step Into or F5 on Eclipse) performed by all developers in all traced tasks on a software project. Each edge colour represents a task, and line thickness is proportional to the number of invocations. Each debugging session contributes a context, and the visualisation is generated by combining all collected invocations. The visualisation is organised in layers or stacks, and each line is a layer of invocations. The starting points (non-invoked methods) are allocated on top of a tree, with the adjacent


Fig 4 GV on all tasks

nodes in an invocation sequence. Besides, developers can directly go to a type in the Eclipse Editor by double-clicking on a node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and zoom in/out on the visualisation. Figure 4 shows an example of the GV with all tasks for the JabRef system, for which we have data about 8 tasks.

The GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including type declarations and method invocations explored by developers, based on their decisions.

5 Using SDI to Understand Debugging Activities

The first benefit of the SDI is the fact that it allows collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviors during debugging activities. To illustrate this point, we conducted two experiments using the SDI to understand developers' debugging habits: the times and effort with which they set breakpoints and the locations where they set breakpoints.

Our analysis builds upon three independent sets of observations involving, in total, three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions available from our own collected videos (Study 1) and an empirical study performed by Jiang et al. [33] (Study 2).

In these studies, we answered the following research questions:

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate more on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef,12 version 3.2, as subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion, low coupling, thus reducing the potential commingle effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service.13 Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montreal to participate in our study. We surveyed all the participants' background information before the study.14 The survey included questions about participants' self-assessment of their level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, knowledge about TDD and interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the debugger of Eclipse.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose the task of fixing faults that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available on https://goo.gl/forms/dxCQaBke2l2cqjB42


Table 1 Summary of the issues considered in JabRef in Study 1

Issues   Summaries
318      "Normalize to Bibtex name format"
667      "hash/pound sign causes URL link to fail"
669      "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993      "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026     "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating, step by step, how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17], to collect breakpoint-related events automatically. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), open-source software for live streaming and recording. We used the OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and also to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 http://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that the task would require about 20 minutes for each fault, which we will discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
– call: method/function invocations;
– return: returns of values;
– assignment: settings of values;
– if-statement: conditional statements;
– while-loop: loop iterations.

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
– Did you locate the fault?
– Where was the fault?
– Why did the fault happen?
– Were you tired?
– How was your debugging experience?

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started to count when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.

– End Time (T): the time when the participant finished a task.


– Elapsed End Time (ET): ET = T − ST
– Elapsed Time of First Breakpoint (EF): EF = FB − ST

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks were solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor.19 They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault-location tasks (developers did not correct the faults), while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET    (1)

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Tasks    Average Times (min)   Std Devs (min)
318      44                    64
667      28                    29
669      22                    25
993      25                    25
1026     25                    17
PdfSam   54                    18
Raptor   59                    13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spent 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, participants took, on average, 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session.a This effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

a In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need to conclude one debugging task in one-to-many interactive debugging sessions.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47), showing that task elapsed time is inversely correlated with the time of a task's first breakpoint. The relation can be fitted by Equation 2:

f(x) = α / x^β    (2)

where α = 1.2 and β = 0.44.

Fig 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where in Study 1 we found 17%, while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statements     Numbers of Breakpoints   %
call           111                      53
if-statement   39                       19
assignment     36                       17
return         18                       10
while-loop     3                        1

Table 4 Study 2 - Breakpoints per type of statement

Statements     Numbers of Breakpoints   %
call           43                       43
if-statement   22                       22
assignment     27                       27
return         4                        4
while-loop     4                        4


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, the while-loop statement being the least common (2-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and different tasks. We sorted all the breakpoints from our data by the class in which they were set and the line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale in that decision, because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks   Classes              Lines of Code   Breakpoints
0318    AuthorsFormatter     43              5
0318    AuthorsFormatter     131             3
0667    BasePanel            935             2
0667    BasePanel            969             3
0667    JabRefDesktop        430             2
0669    OpenDatabaseAction   268             2
0669    OpenDatabaseAction   433             4
0669    OpenDatabaseAction   451             4
0993    EntryEditor          717             2
0993    EntryEditor          720             2
0993    EntryEditor          723             2
0993    BibDatabase          187             2
0993    BibDatabase          456             2
1026    EntryEditor          1184            2
1026    BibtexParser         160             2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                 Lines of Code   Breakpoints
PdfReader               230             2
PdfReader               806             2
PdfReader               1923            2
ConsoleServicesFacade   89              2
ConsoleClient           81              2
PdfUtility              94              2
PdfUtility              96              2
PdfUtility              102             2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes             Lines of Code   Breakpoints
icsUtils            333             3
Game                1751            2
ExamineController   41              2
ExamineController   84              3
ExamineController   87              2
ExamineController   92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the number of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                     Lines of Code        Breakpoints
BibtexParser                138, 151, 159        2, 2, 2
                            160, 165, 168        3, 2, 3
                            176, 198, 199, 299   2, 2, 2, 2
EntryEditor                 717, 720, 721        3, 4, 2
                            723, 837, 842        2, 3, 2
                            1184, 1393           3, 2
BibDatabase                 175, 187, 223, 456   2, 3, 2, 6
OpenDatabaseAction          433, 450, 451        4, 2, 4
JabRefDesktop               40, 84, 430          2, 2, 3
SaveDatabaseAction          177, 188             4, 2
BasePanel                   935, 969             2, 5
AuthorsFormatter            43, 131              5, 4
EntryTableTransferHandler   346                  2
FieldTextMenu               84                   2
JabRefFrame                 1119                 2
JabRefMain                  8                    5
URLUtil                     95                   2

Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints are in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We find that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints


Table 9 Study 1 - Breakpoints by class across different tasks

Types                Tasks with breakpoints (of 5)   Breakpoints   Dev Diversities
SaveDatabaseAction   3                               7             2
BasePanel            4                               14            7
JabRefDesktop        2                               9             4
EntryEditor          3                               36            4
BibtexParser         3                               44            6
OpenDatabaseAction   3                               19            13
JabRef               3                               3             3
JabRefMain           4                               5             4
URLUtil              2                               4             2
BibDatabase          3                               19            4

and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers, on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using the GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program, using the GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is reference management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers,21 23 male and seven female. Our participants have, on average, six years of experience in software development (st. dev. four years). They have, on average, 4.8 years of Java experience (st. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or experts in Java.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of the GV) and 11 participated in fault location (controlled experiment: 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault?" for Task 318 or, for Task 1173, "Where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of the GV.22

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line.24 The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with the GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using the GV on a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experimenttutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line,25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, like whether they completed the tasks successfully or not, and (just for the experimental group) commenting on the usefulness of the GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by the GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig 9 GV for Task 0667


Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths in a graph visualisation from several debugging sessions helps developers produce correct hypotheses about fault locations without seeing the code previously.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants in the experimental group used the GV to support their tasks (video recording analysis): navigating, reorganizing, and, especially, diving into a type by double-clicking on a selected type. We asked participants if the GV is useful to support software maintenance tasks. We report that 87% of participants agreed that the GV is useful or very useful (100% at least useful) through our qualitative study (Figure 11), and 75% of participants claimed that the GV is useful or very useful (100% at least useful) on the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig 11 GV usefulness - experimental phase one

Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that the GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


Table 10 Results from control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 18% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but consequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time to decide on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We go back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example, a null-pointer exception, and the location of the source of the fault, for example, a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale well, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem, because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and to perform more experiments to point out further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debuggers' developers, and educators. The SDI (and GV) is open and freely available online28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time, by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.
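As a thought experiment, such a recommender could start from a simple frequency heuristic over the shared breakpoints. The sketch below is ours, not part of the SDI: the record type, field names, and scoring (counting distinct developers and tasks per location) are illustrative only.

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical record for a shared breakpoint retrieved from the SDS.
record SharedBreakpoint(String file, int line, String task, String developer) {}

public class BreakpointRecommender {
    // Rank locations by how many distinct developers and tasks set a breakpoint there.
    public static List<String> recommend(List<SharedBreakpoint> history, int topK) {
        Map<String, Set<String>> developersPerLocation = new HashMap<>();
        Map<String, Set<String>> tasksPerLocation = new HashMap<>();
        for (SharedBreakpoint bp : history) {
            String location = bp.file() + ":" + bp.line();
            developersPerLocation.computeIfAbsent(location, k -> new HashSet<>()).add(bp.developer());
            tasksPerLocation.computeIfAbsent(location, k -> new HashSet<>()).add(bp.task());
        }
        return developersPerLocation.keySet().stream()
                .sorted(Comparator.comparingInt((String loc) ->
                        developersPerLocation.get(loc).size() + tasksPerLocation.get(loc).size())
                        .reversed())
                .limit(topK)
                .collect(Collectors.toList());
    }
}
```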

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study, by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens, and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant between the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., can be used to minimise a failure-inducing input systematically: the smaller the failure-inducing input, the less program code is covered. Ressia et al. [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements to support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ESEC/FSE) (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types that have source code available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.

– Event is the event data collected when a developer performs some actions during a debugging session.
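To make the meta-model concrete, the following sketch shows how a concept such as Breakpoint could be persisted by a Spring Boot service as a JPA entity. The mapping and field names are our assumptions for illustration; they are not necessarily those of the actual SDS implementation.

```java
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

// Hypothetical JPA mapping of the Breakpoint concept from the SD meta-model.
// Field names are illustrative; the actual SDS schema may differ.
@Entity
public class Breakpoint {
    @Id
    @GeneratedValue
    private Long id;

    private Long sessionId;      // the debugging session that created the breakpoint
    private String typeName;     // fully qualified type where it was toggled
    private String methodName;   // enclosing method, if any
    private int lineNumber;      // line inside the type's source file
    private boolean conditional; // whether a condition is attached

    protected Breakpoint() {}    // required by JPA

    public Breakpoint(Long sessionId, String typeName, String methodName,
                      int lineNumber, boolean conditional) {
        this.sessionId = sessionId;
        this.typeName = typeName;
        this.methodName = methodName;
        this.lineNumber = lineNumber;
        this.conditional = conditional;
    }
}
```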

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework29. Create, retrieve, update,

29 http://projects.spring.io/spring-boot


and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
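For illustration, the endpoint above can be called from any HTTP client. The following minimal Java sketch is our own (not part of the SDS code base); it uses the JDK's HttpClient to issue the example request and print the JSON response, assuming the service is reachable:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Minimal client for the SDS RESTful API (sketch).
public class SwarmApiExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON list of developers named "petrillo"
    }
}
```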

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
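As an illustration of the kind of relational aggregation the console supports, a query counting shared breakpoints per type might look as follows. This JDBC sketch is ours; the connection settings, table, and column names are hypothetical, since the actual SDS schema is not shown here.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Sketch: querying the SDS PostgreSQL database directly.
// Table and column names are hypothetical.
public class BreakpointStats {
    public static void main(String[] args) throws SQLException {
        String url = "jdbc:postgresql://localhost:5432/swarm"; // assumed connection settings
        try (Connection conn = DriverManager.getConnection(url, "swarm", "secret");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT type_name, COUNT(*) AS breakpoints " +
                     "FROM breakpoint GROUP BY type_name ORDER BY breakpoints DESC")) {
            while (rs.next()) {
                System.out.printf("%s: %d%n", rs.getString("type_name"), rs.getInt("breakpoints"));
            }
        }
    }
}
```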

Full-text Search Engine. The SDS also provides an ElasticSearch31, which is a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints and trigger stepping events, such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.
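The sketch below illustrates this listener-based design using the actual Eclipse Debug core interfaces (IDebugEventSetListener and IBreakpointListener); the event-forwarding method is hypothetical and stands in for the SDT's RESTful calls to the Swarm Debugging Services:

```java
import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

// Sketch of a tracer in the spirit of the SDT: it registers for debug events and
// breakpoint changes. sendToSwarmServices() is hypothetical.
public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    public void register() {
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);
        plugin.getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Stepping shows up as SUSPEND events; the step request itself carries
            // STEP_INTO / STEP_OVER / STEP_RETURN details.
            if (event.getKind() == DebugEvent.SUSPEND) {
                sendToSwarmServices("suspend", event.getDetail());
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        sendToSwarmServices("breakpointAdded", 0);
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) {
        sendToSwarmServices("breakpointRemoved", 0);
    }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) {
        sendToSwarmServices("breakpointChanged", 0);
    }

    private void sendToSwarmServices(String kind, int detail) {
        // Hypothetical: POST the event to the Swarm Debugging Services REST API.
        System.out.println("event: " + kind + ", detail: " + detail);
    }
}
```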

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger: the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first breakpoint reached. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing invocation entries for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
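Such a tolerant match can be expressed with an ElasticSearch fuzzy query. The sketch below is our own: the index and field names are hypothetical, but it shows how a query posted to the search engine could still match "factory" when the user types fcatory:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch: a fuzzy ElasticSearch query over shared breakpoints.
// Index name ("breakpoints") and field name ("typeName") are hypothetical.
public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        String query = """
                {"query": {"fuzzy": {"typeName": {"value": "fcatory", "fuzziness": "AUTO"}}}}""";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // hits should include types named "factory"
    }
}
```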


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked by methods, then the Starting and Ending methods are:

StartingPoint = {VSP | VSP ∈ α and VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β and VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
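The two definitions reduce to simple set differences over the collected invocation edges, as the following self-contained sketch (with illustrative method names) shows:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Computes starting methods (invoke but are never invoked) and ending methods
// (are invoked but never invoke) from a list of invoking -> invoked edges.
public class StartEndFinder {
    public static void main(String[] args) {
        List<String[]> edges = List.of(
                new String[]{"Main.main", "Shape.draw"},
                new String[]{"Shape.draw", "Circle.draw"});

        Set<String> invoking = new HashSet<>(); // alpha: vertexes that invoke
        Set<String> invoked = new HashSet<>();  // beta: vertexes that are invoked
        for (String[] e : edges) {
            invoking.add(e[0]);
            invoked.add(e[1]);
        }

        Set<String> starting = new HashSet<>(invoking);
        starting.removeAll(invoked);            // alpha minus beta
        Set<String> ending = new HashSet<>(invoked);
        ending.removeAll(invoking);             // beta minus alpha

        System.out.println("Starting: " + starting); // [Main.main]
        System.out.println("Ending: " + ending);     // [Circle.draw]
    }
}
```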

Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory for developers' explorations. Moreover, dividing software exploration by sessions makes their call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


Fig 1 Setting a static breakpoint (A) and a conditional breakpoint (B) using the Eclipse IDE

– GUI: Most IDEs or browsers offer a visual way of adding a breakpoint, usually by clicking at the beginning of the line on which to set the breakpoint: Chrome1, Visual Studio2, IntelliJ3, and Xcode4.

– Command line: Some programming languages offer debugging tools on the command line, so an IDE is not necessary to debug the code: JDB5, PDB6, and GDB7.

– Code: Some programming languages allow using syntactical elements to set breakpoints, as if they were 'annotations' in the code. This approach often only supports the setting of a breakpoint, and it is necessary to use it in conjunction with the command line or GUI. Some examples are the Ruby debugger8, Firefox9, and Chrome10.

There is a set of features in a debugger that allows developers to control the flow of the execution within the breakpoints, i.e., Call Stack features, which enable continuing or stepping.

A developer can opt for continuing, in which case the debugger resumes execution until the next breakpoint is reached or the program exits. Conversely, stepping allows the developer to run, step by step, the entire program

1 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints
2 https://msdn.microsoft.com/en-us/library/5557y8b4.aspx
3 https://www.jetbrains.com/help/idea/2016.3/debugger-basics.html
4 http://jeffreysambells.com/2014/01/14/using-breakpoints-in-xcode
5 http://docs.oracle.com/javase/7/docs/technotes/tools/windows/jdb.html
6 https://docs.python.org/2/library/pdb.html
7 ftp://ftp.gnu.org/old-gnu/Manuals/gdb-5.1.1/html_node/gdb_37.html
8 https://github.com/cldwalker/debugger
9 https://developer.mozilla.org/pt-BR/docs/Web/JavaScript/Reference/Statements/debugger
10 https://developers.google.com/web/tools/chrome-devtools/javascript/add-breakpoints


flow. The definition of a step varies across programming languages and debuggers, but it generally includes invoking a method and executing a statement. While stepping, a developer can navigate between steps using the following commands:

– Step Over: the debugger steps over a given line. If the line contains a function, the function is executed, and the result is returned without stepping through each of its lines.

– Step Into: the debugger enters the function at the current line and continues stepping from there, line-by-line.

– Step Out: this action takes the debugger back to the line where the current function was called.
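To make these commands concrete, consider the small illustrative program below (our own example, not from the paper's subject systems); the comments indicate where each command takes the debugger from a breakpoint on the marked line:

```java
// Illustrative program for the three stepping commands.
public class SteppingDemo {

    static int square(int x) {
        int result = x * x;   // Step Into from the breakpoint lands here
        return result;        // Step Out from here returns to main(), after the call
    }

    public static void main(String[] args) {
        int value = 4;
        int squared = square(value); // breakpoint here: Step Over runs square() entirely
        System.out.println(squared); // ...and suspends on this next line
    }
}
```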

To start an interactive debugging session, developers set a breakpoint; otherwise, the IDE would not stop and enter its interactive mode. For example, the Eclipse IDE automatically opens the "Debugging Perspective" when execution hits a breakpoint. A developer can run a system in debugging mode without setting breakpoints, but she must set a breakpoint to be able to stop the execution, step in, and observe variable states. Briefly, there is no interactive debugging session without at least one breakpoint set in the code. Finally, some debuggers allow debugging remotely, for example, to perform hot-fixes or to test mobile applications and systems operating in remote configurations.

2.3 Self-organization and Swarm Intelligence

Self-organization is a concept that emerged from the Social Sciences and Biology, and it is defined as the set of dynamic mechanisms enabling structures to appear at the global level of a system from interactions among its lower-level components, without being explicitly coded at the lower levels. Swarm intelligence (SI) describes the behavior resulting from the self-organization of social agents (such as insects) [23]. Ant nests and the societies that they house are examples of SI [24]. Individual ants can only perform relatively simple activities, yet the whole colony can collectively accomplish sophisticated activities. Ants achieve SI by exchanging information encoded as chemical signals (pheromones), e.g., indicating a path to follow or an obstacle to avoid.

Similarly, SI could be used as a metaphor to understand or explain the development of multiversion, large, and complex software systems built by software teams. Individual developers can usually perform activities without having a global understanding of the whole system [25]. In a bird's-eye view, software development is analogous to some SI, in which groups of agents, interacting locally with one another and with their environment and following simple rules, lead to the emergence of global behaviors previously unknown/impossible to the individual agents. We claim that the similarities between the SI of ant nests and complex software systems are not a coincidence: Cockburn [26] suggested that the best architectures, requirements, and designs emerge from self-organizing developers, growing in steps and following their changing knowledge and the changing wishes of the user community, i.e., a typical example of swarm intelligence.

Fig 2 Overview of the Swarm Debugging approach

2.4 Information Foraging

Information Foraging Theory (IFT) is based on the optimal foraging theory developed by Pirolli and Card [27] to understand how people search for information. IFT is rooted in biology studies and theories of how animals hunt for food. It was extended to debugging by Lawrance et al. [27].

However, no previous work proposes the sharing of knowledge related to debugging activities. Differently from works that use IFT with a one-prey/one-predator model [28], we are interested in many developers working independently in many debugging sessions and sharing information to allow SI to emerge. Thus, debugging becomes a foraging process in an SI environment.

These concepts, SI and IFT, have led to the design of a crowd approach applied to debugging activities: a different, collective way of doing debugging that collects, shares, and retrieves information from (previous and current) debugging sessions to support (current and future) debugging sessions.

3 The Swarm Debugging Approach

Swarm Debugging (SD) uses swarm intelligence applied to interactive debugging data to create knowledge for supporting software development activities. Swarm Debugging works as follows.


First, several developers perform their individual, independent debugging activities. During these activities, debugging events are collected by listeners (Label A in Figure 2), for example, breakpoint-toggling and stepping events (Label B in Figure 2), that are then stored in a debugging-knowledge repository (Label C in Figure 2). For accessing this repository, services are defined and implemented in the SDI. For example, stored events are processed by dedicated algorithms (Label D in Figure 2) (1) to create (several types of) visualizations, (2) to offer (distinct ways of) searching, and (3) to provide recommendations to assist developers during debugging. Recommendations are related to the locations where to toggle breakpoints. Storing and using these events allows sharing developers' knowledge among developers, creating a collective intelligence about the software systems and their debugging.

We chose to instrument the Eclipse IDE, a popular IDE, to implement Swarm Debugging and to reach a large number of users. Also, we use services in the cloud to collect the debugging events, to process these events, and to provide visualizations and recommendations from these events. Thus, we decoupled data collection from data usage, allowing other researchers/tool vendors to use the collected data.

During debugging, developers analyze the code, toggling breakpoints and stepping in and through statements. While traditional dynamic analysis approaches collect all interactions, states, or events, SD collects only invocations explicitly explored by developers. The SDI collects only visited areas and paths (chains of invocations by, e.g., Step Into or F5 in the Eclipse IDE) and thus does not suffer from the performance or memory issues that omniscient debuggers [29] or tracing-based approaches could.

Our decision to record information about breakpoints and stepping is well supported by a study by Beller et al. [30]. A finding of this study is that setting breakpoints and stepping through code are the most used debugging features. They showed that most of the recorded debugging events are related to the creation (45.44%), removal (43.62%), or adjustment of breakpoints, hitting them during debugging, and stepping through the source code. Furthermore, other advanced debugging features, like defining watches and modifying variable values, have been used much less [30].

4 SDI in a Nutshell

To evaluate the Swarm Debugging approach, we have implemented the Swarm Debugging Infrastructure (see https://github.com/SwarmDebugging). The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools for collecting, storing, sharing, retrieving, and visualizing data collected during developers' debugging activities. The SDI is an Eclipse IDE11 plug-in integrated with the Eclipse Debug core. It is organized in three main modules: (1) the Swarm Debugging Services, (2) the Swarm Debugging Tracer, and (3) the Swarm

11 https://www.eclipse.org


Fig 3 GV elements - Types (nodes), invocations (edges), and Task filter area

Debugging Views. All the implementation details of the SDI are available in the Appendix.

4.1 Swarm Debugging Global View

The Swarm Debugging Global View (GV) is a call graph for modeling software, based on directed call graphs [31], that makes explicit the hierarchical relationships between invoked methods. This visualization uses rounded grey boxes (Figure 3-A) to represent types or classes (nodes) and oriented arrows (Figure 3-B) to express invocations (edges). GV is built using previous debugging session context data collected by developers for different tasks.

GV was implemented using CytoscapeJS [32], a graph API JavaScript framework, applying an automatic layout manager, breadthfirst. As a web application, the SD visualisations can be integrated into an Eclipse view as an SWT Browser widget or accessed through a traditional browser, such as Mozilla Firefox or Google Chrome.

In this view, the grey boxes are types that developers visited during debugging sessions. The edges represent method calls (Step Into or F5 in Eclipse) performed by all developers in all traced tasks on a software project. Each edge colour represents a task, and line thickness is proportional to the number of invocations. Each debugging session contributes with a context, generating the visualisation combining all collected invocations. The visualisation is organised in layers, or stacks, and each line is a layer of invocations. The starting points (non-invoked methods) are allocated on top of a tree, and the adjacent


Fig 4 GV on all tasks

nodes are in an invocation sequence. Besides, developers can directly go to a type in the Eclipse Editor by double-clicking on a node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and zoom in/out on the visualisation. Figure 4 shows an example of GV with all tasks for the JabRef system, for which we have data about 8 tasks.

GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including the type declarations and method invocations that developers explored based on their decisions.

5 Using SDI to Understand Debugging Activities

The first benefit of SDI is the fact that it allows collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviors during debugging activities. To illustrate this point, we conducted two experiments using SDI to understand developers' debugging habits: the times and effort with which they set breakpoints and the locations where they set breakpoints.

Our analysis builds upon three independent sets of observations involving, in total, three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions available from our own collected videos (Study 1) and from an empirical study performed by Jiang et al. [33] (Study 2).

In this study, we answered the following research questions:

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef12 version 3.2 as the subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion and low coupling, thus reducing the potential commingling effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service13. Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montreal to participate in our study. We surveyed all the participants' background information before the study14. The survey included questions about the participants' self-assessed level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD, interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the debugger of Eclipse.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose the task of locating the faults that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available on https://goo.gl/forms/dxCQaBke2l2cqjB42


Table 1 Summary of the issues considered in JabRef in Study 1

Issues   Summaries
318      "Normalize to Bibtex name format"
667      "hash/pound sign causes URL link to fail"
669      "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993      "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026     "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating step-by-step how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to collect breakpoint-related events automatically. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), open-source software for live streaming and recording. We used the OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and also to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 http://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that each task would require about 20 minutes per fault, which we discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results;

– The statements (lines in the source code) on which the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language (see the annotated example after this list):
  – call: method/function invocations;
  – return: returns of values;
  – assignment: settings of values;
  – if-statement: conditional statements;
  – while-loop: loop iterations;

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?
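
For illustration, the toy Java method below (ours, not taken from the subject systems) annotates one plausible breakpoint site per statement type:

    // Illustrative only: where each statement type occurs.
    public class StatementTypes {

        public int sumPositives(int[] values) {
            int total = 0;                         // assignment
            int i = 0;
            while (i < values.length) {            // while-loop
                if (values[i] > 0) {               // if-statement
                    total = add(total, values[i]); // call (method invocation)
                }
                i++;
            }
            return total;                          // return
        }

        private int add(int a, int b) {
            return a + b;
        }
    }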

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started counting when the participant effectively started the task, i.e., when she started the Swarm Debugging Tracer plug-in, for example;

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint;

– End Time (T): the time when the participant finished a task;


– Elapsed End Time (ET): ET = T − ST;
– Elapsed Time of First Breakpoint (EF): EF = FB − ST.

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaires and the videos. We knew the locations of the faults because all tasks had been solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 took different approaches. The tasks in Study 1 were fault-location tasks, in which developers did not correct the faults, while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 analysed only one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET    (1)
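
A minimal sketch (our own helper, not part of SDI) of how these metrics and the ratio in Equation 1 can be computed from the collected timestamps:

    import java.time.Duration;
    import java.time.Instant;

    // Computes MFB = EF / ET, with EF = FB - ST and ET = T - ST.
    public final class SessionMetrics {

        public static double mfb(Instant start, Instant firstBreakpoint,
                                 Instant end) {
            Duration ef = Duration.between(start, firstBreakpoint); // EF
            Duration et = Duration.between(start, end);             // ET
            return (double) ef.getSeconds() / et.getSeconds();
        }
    }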

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Tasks    Average Times (min)   Std Devs (min)
318      44                    6.4
667      28                    2.9
669      22                    2.5
993      25                    2.5
1026     25                    1.7
PdfSam   54                    18
Raptor   59                    13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std dev 17%). In Study 2, it took participants on average 23% of the task time to set the first breakpoint (std dev 17%).

We conclude that the effort of setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session.a This effort is thus important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

a In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between the task elapsed time and the time of the first breakpoint (ρ = −0.47): the task elapsed time is inversely correlated with the time of the task's first breakpoint. The relation follows

f(x) = α / x^β    (2)

where α = 1.2 and β = 0.44.

Fig. 5 Relation between the time of the first breakpoint and the task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints were set on call statements, while only 1% (3/207) were set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of the breakpoints were set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, for which Study 1 showed 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of the breakpoints were on control-flow statements, while 53% were on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statements     Numbers of Breakpoints
call           111 (53%)
if-statement    39 (19%)
assignment      36 (17%)
return          18 (10%)
while-loop       3 (1%)

Table 4 Study 2 - Breakpoints per type of statement

Statements     Numbers of Breakpoints
call           43 (43%)
if-statement   22 (22%)
assignment     27 (27%)
return          4 (4%)
while-loop      4 (4%)


Our results show that, in both studies, 50% of the breakpoints were set on call statements, while control-flow related statements received comparatively fewer, the while-loop statement being the least common (2-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same lines of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints within the same task and across different tasks. We sorted all the breakpoints in our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, set by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 of class BasePanel, participants set a breakpoint on:

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field);

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Röhm [15]. We suggest that there is an underlying rationale in these decisions, because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks   Classes              Lines of Code   Breakpoints
0318    AuthorsFormatter     43              5
0318    AuthorsFormatter     131             3
0667    BasePanel            935             2
0667    BasePanel            969             3
0667    JabRefDesktop        430             2
0669    OpenDatabaseAction   268             2
0669    OpenDatabaseAction   433             4
0669    OpenDatabaseAction   451             4
0993    EntryEditor          717             2
0993    EntryEditor          720             2
0993    EntryEditor          723             2
0993    BibDatabase          187             2
0993    BibDatabase          456             2
1026    EntryEditor          1184            2
1026    BibtexParser         160             2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                 Lines of Code   Breakpoints
PdfReader               230             2
PdfReader               806             2
PdfReader               1923            2
ConsoleServicesFacade   89              2
ConsoleClient           81              2
PdfUtility              94              2
PdfUtility              96              2
PdfUtility              102             2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes             Lines of Code   Breakpoints
icsUtils            333             3
Game                1751            2
ExamineController   41              2
ExamineController   84              3
ExamineController   87              2
ExamineController   92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks, set by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.
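
A hypothetical sketch of this recommendation idea (not part of SDI) could rank source locations by how many distinct developers set a breakpoint there across past tasks, and suggest the top candidates:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    // Rank "Type:line" locations by developer diversity across past tasks.
    public class BreakpointRecommender {

        public record Breakpoint(String type, int line, String developer) {}

        public List<String> topLocations(List<Breakpoint> history, int k) {
            Map<String, Set<String>> devsPerLocation = new HashMap<>();
            for (Breakpoint b : history) {
                devsPerLocation
                    .computeIfAbsent(b.type() + ":" + b.line(),
                                     key -> new HashSet<>())
                    .add(b.developer());
            }
            List<String> ranked = new ArrayList<>(devsPerLocation.keySet());
            ranked.sort((a, b) -> Integer.compare(
                devsPerLocation.get(b).size(), devsPerLocation.get(a).size()));
            return ranked.subList(0, Math.min(k, ranked.size()));
        }
    }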

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on each class for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the number of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of the breakpoints. For example, class BibtexParser received 21% (44/207) of the breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                     Lines of Code            Breakpoints
BibtexParser                138, 151, 159            2, 2, 2
                            160, 165, 168            3, 2, 3
                            176, 198, 199, 299       2, 2, 2, 2
EntryEditor                 717, 720, 721            3, 4, 2
                            723, 837, 842            2, 3, 2
                            1184, 1393               3, 2
BibDatabase                 175, 187, 223, 456       2, 3, 2, 6
OpenDatabaseAction          433, 450, 451            4, 2, 4
JabRefDesktop               408, 44, 430             2, 2, 3
SaveDatabaseAction          177, 188                 4, 2
BasePanel                   935, 969                 2, 5
AuthorsFormatter            43, 131                  5, 4
EntryTableTransferHandler   346                      2
FieldTextMenu               84                       2
JabRefFrame                 1119                     2
JabRefMain                  8                        5
URLUtil                     95                       2

Fig. 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and that 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6.


Table 9 Study 1 - Breakpoints by class across different tasks

Types                Tasks with breakpoints (of 5)   Breakpoints   Dev. Diversities
SaveDatabaseAction   3                               7             2
BasePanel            4                               14            7
JabRefDesktop        2                               9             4
EntryEditor          3                               36            4
BibtexParser         3                               44            6
OpenDatabaseAction   3                               19            13
JabRef               3                               3             3
JabRefMain           4                               5             4
URLUtil              2                               4             2
BibDatabase          3                               19            4

In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints, set by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wished to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study Design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is reference-management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig. 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21, 23 male and seven female. Our participants have, on average, six years of experience in software development (std dev: four years). They have on average 4.8 years of Java experience (std dev: 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in the qualitative evaluation of GV, and 13 participated in fault location (controlled experiment: 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked five faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described in a video to guide the participants, which we make available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV in a regular Web browser, without access to the JabRef source code. We asked the participants to identify the "type" (class) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group performed the same tasks using the GV.

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experimenttutorial.html
24 https://youtu.be/U1sBMpfL2jc

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25, with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted of starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the locations of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired and to inform us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not, and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 (OBS) video-recording tool.

6.2 Results

We now discuss the results of our evaluation

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type containing each fault, without reading the task description or looking at the code. The GV-generated graph showed invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found; 27/31 sessions) and "bad" sessions (the fault was not found; 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com

For Task 318 (Figure 8), 95% of the participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig. 8 GV for Task 0318

For Task 667 (Figure 9), 95% of the participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig. 9 GV for Task 0667


Fig. 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of the participants (22/23) could suggest a "candidate" type for the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code beforehand.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video-recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on a selected node. We asked participants whether GV is useful to support software-maintenance tasks. In our qualitative study, 87% of the participants agreed that GV is useful or very useful (100% at least useful) (Figure 11), and 75% of the participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig. 11 GV usefulness - experimental phase one

Fig. 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C]
First breakpoint   00:02:55      00:03:40         -44           126%
Time to start      00:04:44      00:05:18         -33           112%
Elapsed time       00:30:08      00:16:05         843           53%

Task 1026
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C]
First breakpoint   00:02:42      00:04:48         -126          177%
Time to start      00:04:02      00:03:43         19            92%
Elapsed time       00:24:58      00:20:41         257           83%

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time setting the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time finishing both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully setting the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results for RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation, and we now discuss both its intrinsic and accidental advantages and limitations as reported by them. We return to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example, a null-pointer exception, and the location of the source of the fault, for example, a constructor where the field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known to not scale well, so we are expecting issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information in the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as a developer performs some changes, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all the developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to the other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that should be refactored if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information in our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently, in different parts of the graph, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and to perform more experiments to find further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, debugger developers, and educators. SDI (and GV) is open and freely available on-line28, and researchers can use it to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns and to identify the debugging strategies that are more efficient in the context of their projects, to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entities. Thus, breakpoints previously toggled by one developer could help assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent toggling a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools, using novel data-mining techniques, that integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this observation creates an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed over time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered the question of what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities in general: they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory of how developers set breakpoints and step through code, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study, by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens, and we compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, their levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef it allowed participants to share paths during debugging and allowed researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities for improvement so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant between the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults had already been solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility of generalising our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the chosen system can affect our results.

9 Related Work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program-comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature- and fault-location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect-report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information; it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view according to a previous context. Our approach works both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that the empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change-impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change-impact analysis during debugging sessions. They found that the programmers in their studies did static change-impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change-impact analysis after they made changes, by running the programs. In their study, programmers did not use any change-impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation provides elements to support developers' hypotheses about fault locations without looking at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to developers and can be starting points when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Röhm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, NY, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ESEC/FSE) (ACM Press, New York, NY, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, NY, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer: an SDT user. She creates and executes debugging sessions.
– Product: the target software product. A product is a set of one or more Eclipse projects.
– Task: the task to be executed by developers.
– Session: represents a Swarm Debugging session. It relates developers, projects, and debugging events.


Fig 14 The Swarm Debugging metadata [17]

– Type: represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types that have source code available as belonging to the project domain.
– Method: a method associated with a type, which can be invoked during debugging sessions.
– Namespace: a container for types. In Java, namespaces are declared with the keyword package.
– Invocation: a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint: represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
– Event: event data collected when a developer performs some action during a debugging session.

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
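For illustration, here is a minimal client-side sketch of how such a request could be issued programmatically. The endpoint follows the example above; the class name and the use of Java 11's HTTP client are our own assumptions, not part of the SDS.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Minimal sketch (assumes Java 11+): query the SDS RESTful API for
// developers named "petrillo" and print the raw JSON response.
public class SwarmClientSketch {
    public static void main(String[] args) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON list of matching developers
    }
}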

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With this dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
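As an illustration, the following sketch runs a Cypher query over such a graph from Java. It assumes the Neo4j Java driver (4.x) and uses illustrative node labels, relationship types, and credentials (Method, INVOKES, neo4j/password), which are our assumptions and not a documented part of the SDS schema.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

// Minimal sketch (assumed labels and credentials): list pairs of
// invoking/invoked methods recorded during debugging sessions.
public class CypherQuerySketch {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "password"));
             Session session = driver.session()) {
            Result result = session.run(
                "MATCH (a:Method)-[:INVOKES]->(b:Method) RETURN a.name, b.name");
            while (result.hasNext()) {
                Record record = result.next();
                System.out.println(record.get("a.name").asString()
                        + " -> " + record.get("b.name").asString());
            }
        }
    }
}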

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
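The listener registration itself relies on standard Eclipse Debug core APIs; a minimal sketch of how a tracer can subscribe to debugger events is shown below. The class body and the handling logic are illustrative assumptions, not the actual SDT sources.

import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IDebugEventSetListener;

// Minimal sketch (illustrative, not the actual SDT code): subscribe to
// debugger events inside an Eclipse plug-in and react to step ends.
public class TracerSketch implements IDebugEventSetListener {

    public void start() {
        DebugPlugin.getDefault().addDebugEventListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.STEP_END) {
                // e.g., inspect the suspended thread's stack frames and
                // record the invoking/invoked methods
            }
        }
    }
}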

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and the stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated at the top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate through sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
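A minimal sketch of such a fuzzy query against the underlying ElasticSearch engine is shown below. The fuzzy query DSL is standard ElasticSearch, but the index name (breakpoints) and field name (typeName) are illustrative assumptions about the SDS data, not documented names.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Minimal sketch (assumes Java 11+; index and field names are
// illustrative): a fuzzy ElasticSearch query that still matches
// "factory"-related breakpoints when the user types "fcatory".
public class FuzzyBreakpointSearchSketch {
    public static void main(String[] args) throws Exception {
        String body = "{ \"query\": { \"fuzzy\": { \"typeName\": "
                + "{ \"value\": \"fcatory\", \"fuzziness\": \"AUTO\" } } } }";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON hits with matching breakpoints
    }
}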


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked by methods, then the starting and ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
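In code, this set difference is straightforward to compute from the recorded invocation pairs. The following is a minimal, self-contained sketch; the method names and edge list are illustrative, not the SDS implementation.

import java.util.HashSet;
import java.util.Set;

// Minimal sketch (illustrative): derive starting methods (invoke others
// but are never invoked) and ending methods (are invoked but never
// invoke) from a list of invoking/invoked pairs.
public class StartEndPointsSketch {
    public static void main(String[] args) {
        String[][] invocations = {
            {"main", "parse"}, {"parse", "readToken"}, {"main", "render"}
        };
        Set<String> alpha = new HashSet<>(); // vertexes that invoke
        Set<String> beta = new HashSet<>();  // vertexes that are invoked
        for (String[] edge : invocations) {
            alpha.add(edge[0]);
            beta.add(edge[1]);
        }
        Set<String> starting = new HashSet<>(alpha);
        starting.removeAll(beta); // in alpha and not in beta
        Set<String> ending = new HashSet<>(beta);
        ending.removeAll(alpha);  // in beta and not in alpha
        System.out.println("Starting: " + starting); // [main]
        System.out.println("Ending: " + ending);     // [readToken, render]
    }
}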

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes its call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



flow. The definition of a step varies across programming languages and debuggers, but it generally includes invoking a method and executing a statement. While stepping, a developer can navigate between steps using the following commands:

– Step Over: the debugger steps over a given line. If the line contains a function, then the function is executed and the result returned without stepping through each of its lines.
– Step Into: the debugger enters the function at the current line and continues stepping from there, line by line.
– Step Out: this action takes the debugger back to the line where the current function was called.
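As a small illustration (our own example, not from the original text), consider a debugger suspended on the marked line below:

// Illustrative program: the debugger is suspended on the marked line.
public class SteppingDemo {
    static int square(int x) {
        return x * x; // Step Into (from the marked line) stops here
    }

    public static void main(String[] args) {
        int total = square(3); // <-- suspended: Step Over runs square()
                               //     entirely and stops on the next line
        System.out.println(total);
        // Step Out, used while inside square(), returns the debugger to
        // the line where square() was invoked.
    }
}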

To start an interactive debugging session, developers set a breakpoint; if they did not, the IDE would not stop and enter its interactive mode. For example, the Eclipse IDE automatically opens the "Debugging Perspective" when execution hits a breakpoint. A developer can run a system in debugging mode without setting breakpoints, but she must set a breakpoint to be able to stop the execution, step in, and observe variable states. Briefly, there is no interactive debugging session without at least one breakpoint set in the code. Finally, some debuggers allow debugging remotely, for example, to perform hot-fixes or to test mobile applications and systems operating in remote configurations.

2.3 Self-organization and Swarm Intelligence

Self-organization is a concept that emerged from the social sciences and biology; it is defined as the set of dynamic mechanisms enabling structures to appear at the global level of a system from interactions among its lower-level components, without being explicitly coded at the lower levels. Swarm intelligence (SI) describes the behavior resulting from the self-organization of social agents (such as insects) [23]. Ant nests and the societies that they house are examples of SI [24]: individual ants can only perform relatively simple activities, yet the whole colony can collectively accomplish sophisticated activities. Ants achieve SI by exchanging information encoded as chemical signals (pheromones), e.g., indicating a path to follow or an obstacle to avoid.

Similarly, SI could be used as a metaphor to understand or explain the development of multiversion, large, and complex software systems built by software teams. Individual developers can usually perform activities without having a global understanding of the whole system [25]. In a bird's-eye view, software development is analogous to some SI, in which groups of agents, interacting locally with one another and with their environment and following simple rules, lead to the emergence of global behaviors previously unknown/impossible to the individual agents. We claim that the similarities between the SI of ant nests and complex software systems are not a coincidence: Cockburn [26] suggested that the best architectures, requirements, and designs emerge from self-organizing developers, growing in steps and following their changing knowledge and the changing wishes of the user community, i.e., a typical example of swarm intelligence.

Fig 2 Overview of the Swarm Debugging approach

2.4 Information Foraging

Information Foraging Theory (IFT) is based on the optimal foraging theory developed by Pirolli and Card [27] to understand how people search for information. IFT is rooted in biology studies and theories of how animals hunt for food. It was extended to debugging by Lawrance et al. [27].

However, no previous work proposes the sharing of knowledge related to debugging activities. Differently from works that use IFT with a one-prey/one-predator model [28], we are interested in many developers working independently in many debugging sessions and sharing information to allow SI to emerge. Thus, debugging becomes a foraging process in a SI environment.

These concepts, SI and IFT, have led to the design of a crowd approach applied to debugging activities: a different, collective way of doing debugging that collects, shares, and retrieves information from (previous and current) debugging sessions to support (current and future) debugging sessions.

3 The Swarm Debugging Approach

Swarm Debugging (SD) uses swarm intelligence applied to interactive debugging data to create knowledge for supporting software development activities. Swarm Debugging works as follows.


First, several developers perform their individual, independent debugging activities. During these activities, debugging events are collected by listeners (Label A in Figure 2), for example, breakpoint-toggling and stepping events (Label B in Figure 2), which are then stored in a debugging-knowledge repository (Label C in Figure 2). For accessing this repository, services are defined and implemented in the SDI. For example, stored events are processed by dedicated algorithms (Label D in Figure 2) (1) to create (several types of) visualizations, (2) to offer (distinct ways of) searching, and (3) to provide recommendations to assist developers during debugging. Recommendations are related to the locations where to toggle breakpoints. Storing and using these events allows sharing developers' knowledge among developers, creating a collective intelligence about the software systems and their debugging.

We chose to instrument the Eclipse IDE, a popular IDE, to implement Swarm Debugging and to reach a large number of users. Also, we use services in the cloud to collect the debugging events, to process these events, and to provide visualizations and recommendations from these events. Thus, we decoupled data collection from data usage, allowing other researchers/tool vendors to use the collected data.

During debugging, developers analyze the code, toggling breakpoints and stepping in and through statements. While traditional dynamic analysis approaches collect all interactions, states, or events, SD collects only invocations explicitly explored by developers. The SDI collects only visited areas and paths (chains of invocations triggered by, e.g., Step Into or F5 in the Eclipse IDE) and thus does not suffer from the performance or memory issues of omniscient debuggers [29] or tracing-based approaches.

Our decision to record information about breakpoints and stepping is well supported by a study by Beller et al. [30]. A finding of this study is that setting breakpoints and stepping through code are the most used debugging features. They showed that most of the recorded debugging events are related to the creation (45.44%), removal (43.62%), or adjustment of breakpoints, hitting them during debugging, and stepping through the source code. Furthermore, other advanced debugging features, like defining watches and modifying variable values, have been much less used [30].

4 SDI in a Nutshell

To evaluate the Swarm Debugging approach, we have implemented the Swarm Debugging Infrastructure (see https://github.com/SwarmDebugging). The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools for collecting, storing, sharing, retrieving, and visualizing data collected during developers' debugging activities. The SDI is an Eclipse IDE11 plug-in integrated with the Eclipse Debug core. It is organized in three main modules: (1) the Swarm Debugging Services, (2) the Swarm Debugging Tracer, and (3) the Swarm Debugging Views. All the implementation details of the SDI are available in the Appendix.

11 https://www.eclipse.org

Fig 3 GV elements - Types (nodes), invocations (edge), and Task filter area

4.1 Swarm Debugging Global View

Swarm Debugging Global View (GV) is a call graph for modeling software, based on the directed call graph [31], that makes explicit the hierarchical relationships created by invoked methods. This visualization uses rounded gray boxes (Figure 3-A) to represent types or classes (nodes) and oriented arrows (Figure 3-B) to express invocations (edges). GV is built using the debugging session context data previously collected by developers for different tasks.

GV was implemented using CytoscapeJS [32], a graph API JavaScript framework, applying the automatic layout manager breadthfirst. As a web application, the SD visualisations can be integrated into an Eclipse view as an SWT Browser widget or accessed through a traditional browser, such as Mozilla Firefox or Google Chrome.

In this view, the grey boxes are types that developers visited during debugging sessions. The edges represent method calls (Step Into or F5 in Eclipse) performed by all developers in all traced tasks on a software project. Each edge colour represents a task, and line thickness is proportional to the number of invocations. Each debugging session contributes with a context, and the visualisation is generated by combining all collected invocations. The visualisation is organised in layers or stacks, and each line is a layer of invocations. The starting points (non-invoked methods) are allocated on top of a tree, with the adjacent nodes in an invocation sequence. Besides, developers can go directly to a type in the Eclipse editor by double-clicking on a node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and zoom in/out on the visualisation. Figure 4 shows an example of GV with all tasks for the JabRef system, for which we have data about 8 tasks.

Fig 4 GV on all tasks

GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including type declarations and method invocations explored by developers, based on their decisions.

5 Using SDI to Understand Debugging Activities

The first benefit of the SDI is that it allows collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviors during debugging activities. To illustrate this point, we conducted two experiments using the SDI to understand developers' debugging habits: the times and effort with which they set breakpoints, and the locations where they set breakpoints.

Our analysis builds upon three independent sets of observations involving, in total, three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions available from our own collected videos (Study 1) and an empirical study performed by Jiang et al. [33] (Study 2).

In this study, we answered the following research questions:

RQ1 Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2 What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

RQ3 Are there consistent, common trends with respect to the types of statements on which developers set breakpoints?

RQ4 Are there consistent, common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef12 version 3.2 as subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion, low coupling, thus reducing the potential commingling effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service13. Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montreal to participate in our study. We surveyed all the participants' background information before the study14. The survey included questions about the participants' self-assessment of their level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD, interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the debugger of Eclipse.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose the task of fixing faults that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available at https://goo.gl/forms/dxCQaBke2l2cqjB42


Table 1 Summary of the issues considered in JabRef in Study 1

Issues  Summaries
318     "Normalize to Bibtex name format"
667     "hash/pound sign causes URL link to fail"
669     "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993     "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026    "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating step by step how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to collect breakpoint-related events automatically. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), open-source software for live streaming and recording. We used the OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and also to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 http://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants who participated in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that the task would require about 20 minutes for each fault, which we will discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and for a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.
– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language (an illustrative snippet follows this list):
  – call: method/function invocations;
  – return: returns of values;
  – assignment: settings of values;
  – if-statement: conditional statements;
  – while-loop: loops, iterations.
– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?
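For illustration, the hypothetical snippet below (our own example, not from the study material) contains one instance of each statement type considered:

// Illustrative example with the five statement types considered.
public class StatementKindsExample {
    static int sum(int[] values) {
        int total = 0;                  // assignment
        int i = 0;                      // assignment
        while (i < values.length) {     // while-loop
            total += values[i];         // assignment
            i++;
        }
        if (total > 5) {                // if-statement
            System.out.println(total);  // call (method invocation)
        }
        return total;                   // return
    }

    public static void main(String[] args) {
        sum(new int[] {1, 2, 3});       // call
    }
}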

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started to count when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.
– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.
– End Time (T): the time when the participant finished a task.
– Elapsed End Time (ET): ET = T - ST.
– Elapsed Time of First Breakpoint (EF): EF = FB - ST.

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks were solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches: the tasks in Study 1 were fault-location tasks (developers did not correct the faults), while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1 Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET    (1)

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Tasks   Average Times (min)  Std Devs (min)
318     44                   64
667     28                   29
669     22                   25
993     25                   25
1026    25                   17
PdfSam  54                   18
Raptor  59                   13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spent 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, it took participants on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort of setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session(a). So, this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

(a) In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2 What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = -0.47): task elapsed time is inversely correlated with the time of the task's first breakpoint.

f(x) = α / x^β    (2)

where α = 12 and β = 0.44.

Fig 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3 Are there consistent, common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where Study 1 showed 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statements    Breakpoints  %
call          111          53
if-statement  39           19
assignment    36           17
return        18           10
while-loop    3            1

Table 4 Study 2 - Breakpoints per type of statement

Statements    Breakpoints  %
call          43           43
if-statement  22           22
assignment    27           27
return        4            4
while-loop    4            4

Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow-related statements were comparatively fewer, the while-loop statement being the least common (2-4%).


RQ4 Are there consistent, common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints from our data by the class in which they were set and the line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on:

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Röhm [15]. We suggest that there is an underlying rationale in that decision, because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks  Classes             Lines of Code  Breakpoints
0318   AuthorsFormatter    43             5
0318   AuthorsFormatter    131            3
0667   BasePanel           935            2
0667   BasePanel           969            3
0667   JabRefDesktop       430            2
0669   OpenDatabaseAction  268            2
0669   OpenDatabaseAction  433            4
0669   OpenDatabaseAction  451            4
0993   EntryEditor         717            2
0993   EntryEditor         720            2
0993   EntryEditor         723            2
0993   BibDatabase         187            2
0993   BibDatabase         456            2
1026   EntryEditor         1184           2
1026   BibtexParser        160            2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                Lines of Code  Breakpoints
PdfReader              230            2
PdfReader              806            2
PdfReader              1923           2
ConsoleServicesFacade  89             2
ConsoleClient          81             2
PdfUtility             94             2
PdfUtility             96             2
PdfUtility             102            2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes            Lines of Code  Breakpoints
icsUtils           333            3
Game               1751           2
ExamineController  41             2
ExamineController  84             3
ExamineController  87             2
ExamineController  92             2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same classes received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on each class for different tasks, putting "Yes" in Table 9 if a type had a breakpoint for a task. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of the breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1, because Study 2 has only one task per system, thus not allowing comparison of breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                    Lines of Code       Breakpoints
BibtexParser               138, 151, 159       2, 2, 2
                           160, 165, 168       3, 2, 3
                           176, 198, 199, 299  2, 2, 2, 2
EntryEditor                717, 720, 721       3, 4, 2
                           723, 837, 842       2, 3, 2
                           1184, 1393          3, 2
BibDatabase                175, 187, 223, 456  2, 3, 2, 6
OpenDatabaseAction         433, 450, 451       4, 2, 4
JabRefDesktop              40, 84, 430         2, 2, 3
SaveDatabaseAction         177, 188            4, 2
BasePanel                  935, 969            2, 5
AuthorsFormatter           43, 131             5, 4
EntryTableTransferHandler  346                 2
FieldTextMenu              84                  2
JabRefFrame                1119                2
JabRefMain                 8                   5
URLUtil                    95                  2

Fig 6 Methods with 5 or more breakpoints

Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and that 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6.


Table 9 Study 1 - Breakpoints by class across different tasks

Types               Tasks with Breakpoints (of 5)  Breakpoints  Dev Diversities
SaveDatabaseAction  3                              7            2
BasePanel           4                              14           7
JabRefDesktop       2                              9            4
EntryEditor         3                              36           4
BibtexParser        3                              44           6
OpenDatabaseAction  3                              19           13
JabRef              3                              3            3
JabRefMain          4                              5            4
URLUtil             2                              4            2
BibDatabase         3                              19           4

In particular, the method EntityEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wished to answer the following two research questions:

RQ5 Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6 Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study Design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is reference-management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21: 23 male and seven female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have, on average, 4.8 years of Java experience (std. dev. 3.3 years), and 97% of them used Eclipse. As shown in Figure 7, 67% are advanced or expert in Java.

Among these professionals, 23 participated in the qualitative evaluation (qualitative evaluation of GV) and 13 participated in fault location (controlled experiment: 7 control and 6 experimental) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, answering questions such as "Where was the fault?" for Task 318 or "For Task 1173, where would you toggle a breakpoint to fix the fault?", as well as questions about positive and negative aspects of GV22.

6.1.4 Artifacts and Working Environment

After the subject-profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environments were correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants; we made this video available online24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, answered using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (class) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 https://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available online25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, like whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than three hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions, in which the fault was found (27/31 sessions), and "bad" sessions, in which the fault was not found (4/31 sessions). We analysed results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig 9 GV for Task 0667


Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (0%, 0/23) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code beforehand.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants of the experimental group used GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants if GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) through our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) on the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig 11 GV usefulness - experimental phase one

Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


Table 10 Results from control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   % [E/C]
First breakpoint   00:02:55      00:03:40         -44           126%
Time to start      00:04:44      00:05:18         -33           112%
Elapsed time       00:30:08      00:16:05         843           53%

Task 1026

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   % [E/C]
First breakpoint   00:02:42      00:04:48         -126          177%
Time to start      00:04:02      00:03:43         19            92%
Elapsed time       00:24:58      00:20:41         257           83%

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we noted that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are near the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to toggle the first breakpoint carefully but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time to decide the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We come back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will add it also into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem, because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to identify further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debugger developers, and educators. SDI (and GV) is open and freely available online28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify the debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools, using novel data-mining techniques, to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers in deciding where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study that we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant between the first and last tasks. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We use only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], and version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated debugging tools Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging Jiang et al. [33] studied the change impact analysis process that should be done during software maintenance by developers to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first


breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and also different developers toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than three hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements to support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularity of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Röhm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, É. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings, International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings, Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ESEC/FSE '09) (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX Data model [56]. The concepts include the following (a code sketch follows the list):

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is event data that is collected when a developer performs some actions during a debugging session.
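For concreteness, the sketch below renders two of these concepts as plain Java classes. The class layout and field names are our own illustrative assumptions, not the actual SDS schema:

import java.time.Instant;

// Hypothetical, simplified rendering of two meta-model concepts; the real
// SDS entities carry more fields plus persistence annotations.
class Session {
    String developer;   // developer who runs the debugging session
    String product;     // product (set of Eclipse projects) under debugging
    String task;        // task being executed
    Instant startedAt;  // when the session was created
}

class Breakpoint {
    String typeName;    // type (class or interface) where the breakpoint was toggled
    String methodName;  // enclosing method, if appropriate
    int lineNumber;     // source line of the breakpoint
    Instant createdAt;  // when the developer toggled it
}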

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework.29 Create, retrieve, update,

29 http://projects.spring.io/spring-boot


and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
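For illustration, such a request can be issued from plain Java with the standard HTTP client; this is a minimal sketch, and the exact shape of the JSON payload is an assumption based on typical Spring Data REST responses:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmClient {
    public static void main(String[] args) throws Exception {
        // Query the SDS RESTful API for developers named "petrillo".
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // The response body is the JSON list of matching developers.
        System.out.println(response.body());
    }
}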

SQL Query Console The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
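As an example of the kind of relational aggregation the console supports, the sketch below counts breakpoints per type through JDBC; the table and column names are illustrative assumptions, not the actual SDS schema:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointStats {
    public static void main(String[] args) throws Exception {
        // Connect to the PostgreSQL database behind the SDS (placeholder credentials).
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://localhost:5432/swarm", "swarm", "secret");
             Statement stmt = conn.createStatement();
             // Aggregate: how many breakpoints were toggled on each type?
             ResultSet rs = stmt.executeQuery(
                     "SELECT type_name, COUNT(*) AS total "
                   + "FROM breakpoint GROUP BY type_name ORDER BY total DESC")) {
            while (rs.next()) {
                System.out.println(rs.getString("type_name") + ": " + rs.getLong("total"));
            }
        }
    }
}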

Full-text Search Engine The SDS also provides an ElasticSearch31 engine, which is a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
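For instance, a pattern query similar to the one in Figure 16 can be run from Java with the official Neo4J driver; the node label and relationship type below (Type, INVOKES) are assumptions for illustration, not necessarily the actual SDS graph schema:

import java.util.Map;

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Session;

public class InvocationQuery {
    public static void main(String[] args) {
        // Connect to the Neo4J instance backing the SDS (placeholder credentials).
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                     AuthTokens.basic("neo4j", "password"));
             Session session = driver.session()) {
            // Which types did developers step into from BibtexParser?
            session.run("MATCH (a:Type {name: $name})-[:INVOKES]->(b:Type) "
                      + "RETURN DISTINCT b.name AS invoked",
                      Map.of("name", "BibtexParser"))
                   .forEachRemaining(r -> System.out.println(r.get("invoked").asString()));
        }
    }
}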

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
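The sketch below approximates how such listeners hook into the Eclipse Debug core. It is a trimmed-down illustration, not the actual SDT code, which also extracts stack traces and forwards the collected data to the SDS:

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class MiniDebugTracer implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        // Subscribe to debugger events and breakpoint changes.
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);
        plugin.getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // A SUSPEND event with STEP_END detail marks the end of a Step Into/Over/Return.
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.STEP_END) {
                System.out.println("Step finished on " + event.getSource());
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        System.out.println("Breakpoint toggled in " + breakpoint.getMarker().getResource());
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}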

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods.


Fig 19 Breakpoint search tool (fuzzy search example)

Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace, without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of breakpoint search in which the search box contains the misspelled word fcatory.
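To illustrate, the fuzzy search of Figure 19 corresponds roughly to a standard ElasticSearch fuzzy query, sent here from Java; the index and field names are assumptions for illustration, not the actual SDS mapping:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BreakpointFuzzySearch {
    public static void main(String[] args) throws Exception {
        // Fuzzy query: the misspelled "fcatory" still matches "factory".
        String query = "{ \"query\": { \"fuzzy\": "
                     + "{ \"typeName\": { \"value\": \"fcatory\" } } } }";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON hits: the matching breakpoints
    }
}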


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes invoking methods and β is the subset of all vertexes invoked by methods, then the Starting and Ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
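In code, the two sets can be computed directly from the collected invocation edges; the following self-contained sketch (with made-up method names as sample data) mirrors the set definitions above:

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartEndMethods {
    record Invocation(String invoking, String invoked) { }

    public static void main(String[] args) {
        // Invocation edges as collected during a debugging session (sample data).
        List<Invocation> edges = List.of(
                new Invocation("Main.main", "Parser.parse"),
                new Invocation("Parser.parse", "Parser.parseEntry"));

        Set<String> alpha = new HashSet<>(); // vertexes that invoke methods
        Set<String> beta = new HashSet<>();  // vertexes that are invoked
        for (Invocation e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }

        Set<String> starting = new HashSet<>(alpha); // alpha minus beta
        starting.removeAll(beta);
        Set<String> ending = new HashSet<>(beta);    // beta minus alpha
        ending.removeAll(alpha);

        System.out.println("Starting: " + starting); // [Main.main]
        System.out.println("Ending: " + ending);     // [Parser.parseEntry]
    }
}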

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory for developer explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 8: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

8Please give a shorter version with authorrunning and titlerunning prior to maketitle

emerge from self-organizing developers growing in steps and following theirchanging knowledge and the changing wishes of the user community ie atypical example of swarm intelligence

Dev1

Dev2

Dev3

DevN

VisualisationsSearching Tools

Recommendation Systems

Single Debugging Session Crowd Debugging Sessions Debugging Information

Positive feedback

Collect data Store data

Transform information

A B C

D

Fig 2 Overview of the Swarm Debugging approach

24 Information Foraging

Information Foraging Theory (IFT) is based on the optimal foraging theorydeveloped by Pirolli and Card [27] to understand how people search for infor-mation IFT is rooted in biology studies and theories of how animals hunt forfood It was extended to debugging by Lawrance et al[27]

However no previous work proposes the sharing of knowledge related todebugging activities Differently from works that use IFT on a model onepreyone predator [28] we are interested in many developers working inde-pendently in many debugging sessions and sharing information to allow SI toemerge Thus debugging becomes a foraging process in a SI environment

These conceptsmdashSI and IFTmdashhave led to the design of a crowd approachapplied to debugging activities a different collective way of doing debuggingthat collects shares retrieves information from (previous and current) debug-ging sessions to support (current and future) debugging sessions

3 The Swarm Debugging Approach

Swarm Debugging (SD) uses swarm intelligence applied to interactive debug-ging data to create knowledge for supporting software development activitiesSwarm Debugging works as follows

Swarm Debugging the Collective Intelligence on Interactive Debugging 9

First several developers perform their individual independent debuggingactivities During these activities debugging events are collected by listeners(Label A in Figure 2) for example breakpoints-toggling and stepping events(Label B in Figure 2) that are then stored in a debugging-knowledge reposi-tory (Label C in Figure 2) For accessing this repository services are definedand implemented in the SDI For example stored events are processed bydedicated algorithms (Label D in Figure 2) (1) to create (several types of)visualizations (2) to offer (distinct ways of) searching and (3) to provide rec-ommendations to assist developers during debugging Recommendations arerelated to the locations where to toggle breakpoints Storing and using theseevents allow sharing developersrsquo knowledge among developers creating a col-lective intelligence about the software systems and their debugging

We chose to instrument the Eclipse IDE, a popular IDE, to implement Swarm Debugging and to reach a large number of users. Also, we use services in the cloud to collect the debugging events, to process these events, and to provide visualizations and recommendations from these events. Thus, we decoupled data collection from data usage, allowing other researchers and tool vendors to use the collected data.

During debugging, developers analyze the code, toggling breakpoints and stepping in and through statements. While traditional dynamic analysis approaches collect all interactions, states, or events, SD collects only invocations explicitly explored by developers. The SDI collects only visited areas and paths (chains of invocations by, e.g., Step Into or F5 in the Eclipse IDE) and thus does not suffer from performance or memory issues as omniscient debuggers [29] or tracing-based approaches could.

Our decision to record information about breakpoints and stepping is well supported by a study by Beller et al. [30]. A finding of this study is that setting breakpoints and stepping through code are the most used debugging features. They showed that most of the recorded debugging events are related to the creation (45.44%), removal (43.62%), or adjustment of breakpoints, hitting them during debugging, and stepping through the source code. Furthermore, other advanced debugging features, like defining watches and modifying variable values, were much less used [30].

4 SDI in a Nutshell

To evaluate the Swarm Debugging approach, we implemented the Swarm Debugging Infrastructure (see https://github.com/SwarmDebugging). The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools for collecting, storing, sharing, retrieving, and visualizing data collected during developers' debugging activities. The SDI is an Eclipse IDE^11 plug-in integrated with the Eclipse Debug core. It is organized in three main modules: (1) the Swarm Debugging Services, (2) the Swarm Debugging Tracer, and (3) the Swarm

11 https://www.eclipse.org


Fig. 3 GV elements: types (nodes), invocations (edges), and the task filter area

Debugging Views. All the implementation details of the SDI are available in the Appendix.

4.1 Swarm Debugging Global View

The Swarm Debugging Global View (GV) is a call graph for modeling software, based on a directed call graph [31], that makes explicit the hierarchical relationships among invoked methods. This visualization uses rounded gray boxes (Figure 3-A) to represent types or classes (nodes) and oriented arrows (Figure 3-B) to express invocations (edges). GV is built using debugging session context data previously collected by developers for different tasks.

GV was implemented using Cytoscape.js [32], a graph API JavaScript framework, applying the automatic breadthfirst layout manager. As a web application, the SD visualisations can be integrated into an Eclipse view as an SWT Browser widget or accessed through a traditional browser, such as Mozilla Firefox or Google Chrome.

In this view, the grey boxes are types that developers visited during debugging sessions. The edges represent method calls (Step Into or F5 in Eclipse) performed by all developers in all traced tasks on a software project. Each edge colour represents a task, and line thickness is proportional to the number of invocations. Each debugging session contributes with a context, generating the visualisation by combining all collected invocations. The visualisation is organised in layers or stacks, and each line is a layer of invocations. The starting points (non-invoked methods) are allocated on top of a tree, above the adjacent


Fig. 4 GV on all tasks

nodes in an invocation sequence. Besides, developers can directly go to a type in the Eclipse editor by double-clicking on a node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and zoom in/out on the visualisation. Figure 4 shows an example of GV with all tasks for the JabRef system, for which we have data about eight tasks.

GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including type declarations and method invocations explored by developers based on their decisions.

5 Using SDI to Understand Debugging Activities

The first benefit of the SDI is that it allows collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviors during debugging activities. To illustrate this point, we conducted two experiments using the SDI to understand developers' debugging habits: the times and effort with which they set breakpoints and the locations where they set breakpoints.

Our analysis builds upon three independent sets of observations involving, in total, three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions available from our own collected videos (Study 1) and an empirical study performed by Jiang et al. [33] (Study 2).

In this study, we answered the following research questions:

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef^12 version 3.2 as the subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion and low coupling, thus reducing the potential commingle effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service^13. Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montréal to participate in our study. We surveyed all the participants' background information before the study^14. The survey included questions about participants' self-assessment of their level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD, interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the debugger of Eclipse.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose the task of fixing faults that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available at https://goo.gl/forms/dxCQaBke2l2cqjB42


Table 1 Summary of the issues considered in JabRef in Study 1

Issue   Summary
318     "Normalize to Bibtex name format"
667     "hash/pound sign causes URL link to fail"
669     "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993     "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026    "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial^15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video^16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating, step by step, how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to collect breakpoint-related events automatically. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software^17 (OBS), open-source software for live streaming and recording. We used OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and also to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 http://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that the task would require about 20 minutes for each fault, which we discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
  – call: method/function invocations
  – return: returns of values
  – assignment: settings of values
  – if-statement: conditional statements
  – while-loop: loop iterations

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?

Based on this data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started counting when the participant effectively started the task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.

– End Time (T): the time when the participant finished a task.


– Elapsed End Time (ET): ET = T − ST
– Elapsed Time of First Breakpoint (EF): EF = FB − ST

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks were solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge^18 (PDFSaM) and Raptor^19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault-location tasks (developers did not correct the faults), while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1).

MFB = EF / ET    (1)
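To illustrate the normalisation (with hypothetical numbers): a participant who sets her first breakpoint 8 minutes into a 30-minute task obtains MFB = 8/30 ≈ 0.27, i.e., about 27% of the session elapsed before the first breakpoint was set.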

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Task     Average Time (min)   Std Dev (min)
318      44                   64
667      28                   29
669      22                   25
993      25                   25
1026     25                   17
PdfSam   54                   18
Raptor   59                   13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, participants took on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session^a. So, this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

a In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47), showing that task elapsed time is inversely correlated with the time of the task's first breakpoint.

f(x) = α / x^β    (2)

where α = 1.2 and β = 0.44.
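Reading Equation 2 as the fitted trend of MFB against the task's elapsed time x in minutes (an assumption; the text does not state the units explicitly), a 30-minute task gives f(30) = 1.2 / 30^0.44 ≈ 1.2 / 4.47 ≈ 0.27, which is consistent with the roughly one-quarter-of-session estimate reported for RQ1.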

Fig. 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where Study 1 showed 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statement      Breakpoints   %
call           111           53
if-statement   39            19
assignment     36            17
return         18            10
while-loop     3             1

Table 4 Study 2 - Breakpoints per type of statement

Statement      Breakpoints   %
call           43            43
if-statement   22            22
assignment     27            27
return         4             4
while-loop     4             4


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, with the while-loop statement being the least common (2-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on:

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field);

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Röhm [15]. We suggest that there is an underlying rationale in that decision because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints on the same line of code (JabRef) by task

Task   Class               Line   Breakpoints
0318   AuthorsFormatter    43     5
0318   AuthorsFormatter    131    3
0667   BasePanel           935    2
0667   BasePanel           969    3
0667   JabRefDesktop       430    2
0669   OpenDatabaseAction  268    2
0669   OpenDatabaseAction  433    4
0669   OpenDatabaseAction  451    4
0993   EntryEditor         717    2
0993   EntryEditor         720    2
0993   EntryEditor         723    2
0993   BibDatabase         187    2
0993   BibDatabase         456    2
1026   EntryEditor         1184   2
1026   BibtexParser        160    2


Table 6 Study 2 - Breakpoints on the same line of code (PdfSam)

Class                  Line   Breakpoints
PdfReader              230    2
PdfReader              806    2
PdfReader              1923   2
ConsoleServicesFacade  89     2
ConsoleClient          81     2
PdfUtility             94     2
PdfUtility             96     2
PdfUtility             102    2

Table 7 Study 2 - Breakpoints on the same line of code (Raptor)

Class              Line   Breakpoints
icsUtils           333    3
Game               1751   2
ExamineController  41     2
ExamineController  84     3
ExamineController  87     2
ExamineController  92     2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 in class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, marking a type if it had a breakpoint, producing Table 9. We also counted the number of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing comparing breakpoints across tasks.)


Table 8 Study 1 - Breakpoints on the same line of code (JabRef) across all tasks

Class                      Lines of Code        Breakpoints
BibtexParser               138, 151, 159        2, 2, 2
                           160, 165, 168        3, 2, 3
                           176, 198, 199, 299   2, 2, 2, 2
EntryEditor                717, 720, 721        3, 4, 2
                           723, 837, 842        2, 3, 2
                           1184, 1393           3, 2
BibDatabase                175, 187, 223, 456   2, 3, 2, 6
OpenDatabaseAction         433, 450, 451        4, 2, 4
JabRefDesktop              40, 84, 430          2, 2, 3
SaveDatabaseAction         177, 188             4, 2
BasePanel                  935, 969             2, 5
AuthorsFormatter           43, 131              5, 4
EntryTableTransferHandler  346                  2
FieldTextMenu              84                   2
JabRefFrame                1119                 2
JabRefMain                 8                    5
URLUtil                    95                   2

Fig. 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6.


Table 9 Study 1 - Breakpoints by class across different tasks (Issues 318, 667, 669, 993, and 1026)

Class               Tasks with breakpoints (of 5)   Breakpoints   Dev. diversity
SaveDatabaseAction  3                               7             2
BasePanel           4                               14            7
JabRefDesktop       2                               9             4
EntryEditor         3                               36            4
BibtexParser        3                               44            6
OpenDatabaseAction  3                               19            13
JabRef              3                               3             3
JabRefMain          4                               5             4
URLUtil             2                               4             2
BibDatabase         3                               19            4

In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study Design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef^20 as the subject system. JabRef is reference-management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig. 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelance developers^21: 23 male and seven female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have on average 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or experts in Java.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 13 participated in fault location (controlled experiment: 7 control and 6 experimental) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked five faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault?" for Task 318 or, for Task 1173, "Where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV^22.

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial^23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environments were correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We made this video available on-line^24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available at https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line^25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted of starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired and to inform us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server^26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software^27 video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type at the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found; 27/31 sessions) and "bad" sessions (the fault was not found; 4/31 sessions). We analysed results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available at https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig. 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig. 9 GV for Task 0667


Fig. 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0%, or 0/23) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without previously seeing the code.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video-recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on a selected type. We asked participants whether GV is useful to support software-maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig. 11 GV usefulness - experimental phase one

Fig. 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric            Control [C]   Experiment [E]   ∆ [C−E] (s)   % [E/C]
First breakpoint  00:02:55      00:03:40         −44           126
Time to start     00:04:44      00:05:18         −33           112
Elapsed time      00:30:08      00:16:05         843           53

Task 1026
Metric            Control [C]   Experiment [E]   ∆ [C−E] (s)   % [E/C]
First breakpoint  00:02:42      00:04:48         −126          177
Time to start     00:04:02      00:03:43         19            92
Elapsed time      00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recording sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in toggling the first breakpoint carefully but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We return to some of these limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example, a null-pointer exception, and the location of the source of the fault, for example, a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to be scalable, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information in the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently in different parts of the graph to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to identify further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, developers of debuggers, and educators. The SDI (and GV) is open and freely available on-line^28, and researchers can use it to perform new empirical studies of debugging activities.

Developers can use the SDI to record their debugging patterns and to identify the debugging strategies that are most efficient in the context of their projects, to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent toggling a new breakpoint.
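As an illustration, such a search could be as simple as the query below; the /breakpoints resource and its query parameters are assumptions made for this sketch, not the documented SDI API.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/**
 * Sketch of the kind of query a breakpoint search tool could issue
 * against the SDI services to retrieve shared breakpoints.
 */
public class BreakpointSearch {
    public static void main(String[] args) throws Exception {
        URI query = URI.create(
                "http://server.swarmdebugging.org/breakpoints"
                + "?project=JabRef&className=BibtexParser");  // assumed resource

        HttpResponse<String> response = HttpClient.newHttpClient().send(
                HttpRequest.newBuilder(query).GET().build(),
                HttpResponse.BodyHandlers.ofString());

        // Breakpoints toggled by other developers on the same entity become
        // candidate starting points for a new debugging session.
        System.out.println(response.body());
    }
}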

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools, using novel data-mining techniques, to integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Furthermore, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered the question of what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities in general: they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory of how developers set breakpoints and step through code, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants, and more systems and tasks, are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study, by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant between the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related Work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature- and fault-location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43, 44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6, 52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change-impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change-impact analysis during debugging sessions. They found that the programmers in their studies did static change-impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change-impact analysis after they made changes, by running the programs. In their study, programmers did not use any change-impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos, covering 45 debugging sessions performed by 28 different independent developers and containing 307 breakpoints, on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale: different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation provides elements to support developers' hypotheses about fault locations, without their having to look at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters"; and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agency CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211-220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceeding of the 28th international conference on Software engineering - ICSE '06 p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13-16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126-135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492-501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100-116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Röhm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140-144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285-295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96-106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11-20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572-583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97 pp. 108-124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1-39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271-278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transaction on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53-63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14-24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689-699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1-11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251-260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31-40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456-465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391-400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99-110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11 p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, in Proceedings of the 23rd International Conference on Machine Learning (2006), pp. 1105-1112. DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569-1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485-495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013 pp. 110-119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175-. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063-3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551

Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We choose and define domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

Fig. 14 The Swarm Debugging metadata [17]

– Developer: an SDT user; she creates and executes debugging sessions.
– Product: the target software product; a product is a set of one or more Eclipse projects.
– Task: the task to be executed by developers.
– Session: a Swarm Debugging session; it relates developers, projects, and debugging events.
– Type: a class or interface in the project. Each type has a source code and a file; the SDS only considers types whose source code is available as belonging to the project domain.
– Method: a method associated with a type, which can be invoked during debugging sessions.
– Namespace: a container for types; in Java, namespaces are declared with the keyword package.
– Invocation: a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint: the data collected when a developer toggles a breakpoint in the Eclipse IDE; each breakpoint is associated with a type and, if appropriate, a method.
– Event: the data collected when a developer performs some action during a debugging session.
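To make this meta-model concrete, the sketch below renders two of these concepts as plain Java classes. It is only an illustration: the field names and types are our assumptions, not the actual SDS entity definitions.

// Hypothetical rendering of two SDS concepts; the real entities may differ.
class Session {
    long id;
    String developerName; // the developer who created and ran the session
    String productName;   // the target software product
    String taskName;      // the task being performed
}

class Breakpoint {
    long id;
    long sessionId;    // the debugging session in which it was toggled
    String typeName;   // the type (class or interface) that received it
    String methodName; // the enclosing method, if appropriate
    int lineNumber;    // the line on which the breakpoint was set
}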

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
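For illustration, such a request can be issued from any HTTP client. A minimal sketch in Java (assuming Java 11+ and its java.net.http API, which is not part of the SDI itself):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmApiExample {
    public static void main(String[] args) throws Exception {
        // Query the SDS for developers named "petrillo".
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON list of matching developers
    }
}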

SQL Query Console. The SDS provides a console30 that receives SQL queries on the debugging data, providing relational aggregations and functions.
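As an illustration, a query of the following kind could aggregate the collected breakpoints per type on that console. The table and column names below are assumptions for the sake of the example, not the actual SDS schema:

-- Count collected breakpoints per type, most-debugged types first.
SELECT type_name, COUNT(*) AS breakpoints
FROM breakpoint
GROUP BY type_name
ORDER BY breakpoints DESC;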

Full-text Search Engine. The SDS also provides an ElasticSearch31 instance, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co

Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
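A Cypher query in the spirit of Figure 16 might look as follows; the node label and the INVOKES relationship name are assumptions for illustration:

// Retrieve the methods that invoke a given method, with their invocations.
MATCH (caller:Method)-[inv:INVOKES]->(callee:Method)
WHERE callee.name = 'storeSource'
RETURN caller, inv, callee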

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
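The sketch below illustrates how such listeners can be registered with the Eclipse debug framework. It is a simplified illustration: the listener registration uses the standard Eclipse Debug Core API, but the calls that forward data to the SDS are hypothetical placeholders.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    public void install() {
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);                        // stepping, suspend, resume...
        plugin.getBreakpointManager().addBreakpointListener(this); // breakpoint add/remove/change
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent e : events) {
            // A completed step (Step Into/Over/Return) surfaces as a SUSPEND
            // event whose detail is STEP_END.
            if (e.getKind() == DebugEvent.SUSPEND && e.getDetail() == DebugEvent.STEP_END) {
                // sendSteppingEventToSds(e); // hypothetical forwarding to the SDS
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // sendBreakpointToSds(breakpoint); // hypothetical forwarding to the SDS
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}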

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack-trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18); this association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com

Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and its related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices, which send RESTful messages to a SwarmRestFacade connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace, without returns. The first node is a starting method (a non-invoked method), and the last node is an ending method (a non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.

Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Fig. 19 Breakpoint search tool (fuzzy search example)

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
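Internally, such a search can be expressed with ElasticSearch's fuzzy query DSL. The sketch below illustrates the idea; the index name and field name are our assumptions, not the actual SDS mapping:

POST /breakpoints/_search
{
  "query": {
    "fuzzy": {
      "typeName": { "value": "fcatory" }
    }
  }
}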

Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked, then the starting and ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
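A direct way to compute these two sets from the collected invocation pairs is sketched below (a minimal illustration, not the actual SDS implementation):

import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class EntryExitPoints {
    public static void main(String[] args) {
        // Each edge is an (invoking, invoked) pair of method identifiers.
        List<String[]> edges = Arrays.asList(
                new String[] {"main", "parse"},
                new String[] {"parse", "readEntry"});

        Set<String> invoking = new HashSet<>(); // alpha: vertexes that invoke
        Set<String> invoked = new HashSet<>();  // beta: vertexes that are invoked
        for (String[] e : edges) {
            invoking.add(e[0]);
            invoked.add(e[1]);
        }

        Set<String> starting = new HashSet<>(invoking);
        starting.removeAll(invoked); // in alpha and not in beta -> {"main"}
        Set<String> ending = new HashSet<>(invoked);
        ending.removeAll(invoking);  // in beta and not in alpha -> {"readEntry"}

        System.out.println("Starting: " + starting + ", Ending: " + ending);
    }
}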

Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

First, several developers perform their individual, independent debugging activities. During these activities, debugging events are collected by listeners (Label A in Figure 2), for example breakpoint-toggling and stepping events (Label B in Figure 2), which are then stored in a debugging-knowledge repository (Label C in Figure 2). For accessing this repository, services are defined and implemented in the SDI. For example, stored events are processed by dedicated algorithms (Label D in Figure 2) (1) to create (several types of) visualizations, (2) to offer (distinct ways of) searching, and (3) to provide recommendations to assist developers during debugging. Recommendations are related to the locations where to toggle breakpoints. Storing and using these events allows sharing developers' knowledge among developers, creating a collective intelligence about the software systems and their debugging.

We chose to instrument the Eclipse IDE, a popular IDE, to implement Swarm Debugging and to reach a large number of users. Also, we use services in the cloud to collect the debugging events, to process these events, and to provide visualizations and recommendations from these events. Thus, we decoupled data collection from data usage, allowing other researchers/tool vendors to use the collected data.

During debugging, developers analyze the code, toggling breakpoints and stepping in and through statements. While traditional dynamic analysis approaches collect all interactions, states, or events, SD collects only invocations explicitly explored by developers. The SDI collects only visited areas and paths (chains of invocations, e.g., by Step Into or F5 in the Eclipse IDE) and thus does not suffer from performance or memory issues, as omniscient debuggers [29] or tracing-based approaches could.

Our decision to record information about breakpoints and stepping is well supported by a study by Beller et al. [30]. A finding of this study is that setting breakpoints and stepping through code are the most used debugging features. They showed that most of the recorded debugging events are related to the creation (45.44%), removal (43.62%), or adjustment of breakpoints, hitting them during debugging, and stepping through the source code. Furthermore, other advanced debugging features, like defining watches and modifying variable values, have been much less used [30].

4 SDI in a Nutshell

To evaluate the Swarm Debugging approach, we have implemented the Swarm Debugging Infrastructure (see https://github.com/SwarmDebugging). The Swarm Debugging Infrastructure (SDI) [17] provides a set of tools for collecting, storing, sharing, retrieving, and visualizing data collected during developers' debugging activities. The SDI is an Eclipse IDE11 plug-in integrated with the Eclipse Debug core. It is organized in three main modules: (1) the Swarm Debugging Services, (2) the Swarm Debugging Tracer, and (3) the Swarm Debugging Views. All the implementation details of the SDI are available in the Appendix.

11 https://www.eclipse.org

Fig. 3 GV elements - Types (nodes), invocations (edges), and task filter area

4.1 Swarm Debugging Global View

The Swarm Debugging Global View (GV) is a call graph for modeling software, based on directed call graphs [31], that makes explicit the hierarchical relationships among invoked methods. This visualization uses rounded gray boxes (Figure 3-A) to represent types or classes (nodes) and oriented arrows (Figure 3-B) to express invocations (edges). The GV is built using previous debugging session context data, collected by developers for different tasks.

The GV was implemented using Cytoscape.js [32], a graph API JavaScript framework, applying its automatic breadth-first layout manager. As a web application, the SD visualisations can be integrated into an Eclipse view as an SWT Browser widget or accessed through a traditional browser, such as Mozilla Firefox or Google Chrome.

In this view, the grey boxes are types that developers visited during debugging sessions. The edges represent method calls (Step Into or F5 in Eclipse) performed by all developers in all traced tasks on a software project. Each edge colour represents a task, and line thickness is proportional to the number of invocations. Each debugging session contributes with a context, and the visualisation is generated by combining all collected invocations. The visualisation is organised in layers, or stacks, and each line is a layer of invocations. The starting points (non-invoked methods) are allocated on top of a tree, with adjacent nodes in an invocation sequence. Besides, developers can directly go to a type in the Eclipse Editor by double-clicking on a node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and zoom in/out on the visualisation. Figure 4 shows an example of the GV with all tasks for the JabRef system, for which we have data about 8 tasks.

Fig. 4 GV on all tasks

The GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including type declarations and method invocations explored by developers, based on their decisions.

5 Using SDI to Understand Debugging Activities

The first benefit of the SDI is the fact that it allows for collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviors during debugging activities. To illustrate this point, we conducted two experiments using the SDI to understand developers' debugging habits: the times and effort with which they set breakpoints and the locations where they set them.

Our analysis builds upon three independent sets of observations, involving in total three systems: Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions, available from our own collected videos (Study 1) and from an empirical study performed by Jiang et al. [33] (Study 2).

In this study, we answered the following research questions:

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef12 version 3.2 as subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion and low coupling, thus reducing the potential commingling effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service13. Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montreal to participate in our study. We surveyed all the participants' background information before the study14. The survey included questions about participants' self-assessment of their level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD, interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the debugger of Eclipse.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose the task of fixing faults that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults, using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available at https://goo.gl/forms/dxCQaBke2l2cqjB42

Table 1 Summary of the issues considered in JabRef in Study 1

Issues  Summaries

318     "Normalize to Bibtex name format"
667     "hash/pound sign causes URL link to fail"
669     "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993     "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026    "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating, step by step, how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17], to collect breakpoint-related events automatically. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), an open-source software for live streaming and recording. We used OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and also to accustom the participants to the study settings. It was a trivial task that we also used to filter out the participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 http://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com

All participants who took part in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that each task would require about 20 minutes per fault, which we discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and for a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
  – call: method/function invocations;
  – return: returns of values;
  – assignment: settings of values;
  – if-statement: conditional statements;
  – while-loop: loop iterations.

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started counting when the participant effectively started the task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.

– End Time (T): the time when the participant finished a task.

– Elapsed End Time (ET): ET = T - ST.
– Elapsed Time to First Breakpoint (EF): EF = FB - ST.

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks had been solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches: the tasks in Study 1 were fault-location tasks (developers did not correct the faults), while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET    (1)
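For instance, under this normalization, a session in which the first breakpoint is set 10 minutes into a 40-minute task yields MFB = 10/40 = 0.25.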

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Tasks    Average Times (min)    Std Devs (min)
318      44                     64
667      28                     29
669      22                     25
993      25                     25
1026     25                     17
PdfSam   54                     18
Raptor   59                     13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, it took participants on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session (a). So, this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

(a) In fact, there is a "debugging task" that starts when a developer starts to investigate the issue, to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.

RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = -0.47): task elapsed time is inversely correlated with the time of the task's first breakpoint. The fitted curve is

f(x) = α / x^β    (2)

where α = 1.2 and β = 0.44.

Fig. 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].

RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where in Study 1 we found 17%, while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statements     Numbers of Breakpoints
call           111    53%
if-statement   39     19%
assignment     36     17%
return         18     10%
while-loop     3      1%

Table 4 Study 2 - Breakpoints per type of statement

Statements     Numbers of Breakpoints
call           43     43%
if-statement   22     22%
assignment     27     27%
return         4      4%
while-loop     4      4%

Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow-related statements were comparatively fewer, the while-loop statement being the least common (2-4%).

RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 of class BasePanel, participants set a breakpoint on:

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Röhm [15]. We suggest that there is an underlying rationale to that decision, because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints on the same line of code (JabRef) by task

Tasks  Classes             Lines of Code  Breakpoints
0318   AuthorsFormatter    43             5
0318   AuthorsFormatter    131            3
0667   BasePanel           935            2
0667   BasePanel           969            3
0667   JabRefDesktop       430            2
0669   OpenDatabaseAction  268            2
0669   OpenDatabaseAction  433            4
0669   OpenDatabaseAction  451            4
0993   EntryEditor         717            2
0993   EntryEditor         720            2
0993   EntryEditor         723            2
0993   BibDatabase         187            2
0993   BibDatabase         456            2
1026   EntryEditor         1184           2
1026   BibtexParser        160            2

Table 6 Study 2 - Breakpoints on the same line of code (PdfSam)

Classes                Lines of Code  Breakpoints
PdfReader              230            2
PdfReader              806            2
PdfReader              1923           2
ConsoleServicesFacade  89             2
ConsoleClient          81             2
PdfUtility             94             2
PdfUtility             96             2
PdfUtility             102            2

Table 7 Study 2 - Breakpoints on the same line of code (Raptor)

Classes            Lines of Code  Breakpoints
icsUtils           333            3
Game               1751           2
ExamineController  41             2
ExamineController  84             3
ExamineController  87             2
ExamineController  92             2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line of code 969 in class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)

Table 8 Study 1 - Breakpoints on the same line of code (JabRef) in all tasks

Classes                    Lines of Code       Breakpoints
BibtexParser               138, 151, 159       2, 2, 2
                           160, 165, 168       3, 2, 3
                           176, 198, 199, 299  2, 2, 2, 2
EntryEditor                717, 720, 721       3, 4, 2
                           723, 837, 842       2, 3, 2
                           1184, 1393          3, 2
BibDatabase                175, 187, 223, 456  2, 3, 2, 6
OpenDatabaseAction         433, 450, 451       4, 2, 4
JabRefDesktop              40, 84, 430         2, 2, 3
SaveDatabaseAction         177, 188            4, 2
BasePanel                  935, 969            2, 5
AuthorsFormatter           43, 131             5, 4
EntryTableTransferHandler  346                 2
FieldTextMenu              84                  2
JabRefFrame                1119                2
JabRefMain                 8                   5
URLUtil                    95                  2

Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints, and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Table 9 Study 1 - Breakpoints by class across different tasks

Types Issue 318 Issue 667 Issue 669 Issue 993 Issue 1026 Breakpoints Dev Diversities

SaveDatabaseAction Yes Yes Yes 7 2

BasePanel Yes Yes Yes Yes 14 7

JabRefDesktop Yes Yes 9 4

EntryEditor Yes Yes Yes 36 4

BibtexParser Yes Yes Yes 44 6

OpenDatabaseAction Yes Yes Yes 19 13

JabRef Yes Yes Yes 3 3

JabRefMain Yes Yes Yes Yes 5 4

URLUtil Yes Yes 4 2

BibDatabase Yes Yes Yes 19 4

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wished to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using the GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program, using the GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is a reference-management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelance developers21: 23 male and seven female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have on average 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert in Java.

Among these professionals, 23 participated in a qualitative evaluation (the qualitative evaluation of the GV) and 13 participated in fault location (the controlled experiment - 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com

6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of the GV.22

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environments were correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants; we make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with the GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using the GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the locations of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

22 The full qualitative evaluation survey is available at https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25, with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, such as whether they completed the tasks successfully or not, and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig. 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig. 9 GV for Task 0667


Fig. 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% – 0/23) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without previously seeing the code.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants of the experimental group used GV to support their tasks (video-recording analysis), navigating, reorganizing, and especially diving into the types by double-clicking on a selected type. We asked participants if GV is useful to support software-maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig. 11 GV usefulness – experimental phase one

Fig. 12 GV usefulness – experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching code.


Table 10 Results from control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   ∆ [C−E] (s)   % [E/C]
First breakpoint   00:02:55      00:03:40         -44           126%
Time to start      00:04:44      00:05:18         -33           112%
Elapsed time       00:30:08      00:16:05         843           53%

Task 1026

Metric             Control [C]   Experiment [E]   ∆ [C−E] (s)   % [E/C]
First breakpoint   00:02:42      00:04:48         -126          177%
Time to start      00:04:02      00:03:43         19            92%
Elapsed time       00:24:58      00:20:41         257           83%

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time setting the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We come back to some of the limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point to further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for Software Engineering researchers, developers, debugger developers, and educators. SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify the debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent toggling a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities in general: they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on how developers set breakpoints and step through code, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant between the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related Work

We now summarise works related to debugging to allow a better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program-comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature- and fault-location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history, and defect-report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program-exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change-impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change-impact analysis during debugging sessions. They found that the programmers in their studies did static change-impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change-impact analysis after they made changes, by running the programs. In their study, programmers did not use any change-impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale: different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation provides elements to support developers' hypotheses about fault locations without previously looking at the code, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson. Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs. Debugging with GDB – The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright. GNU DDD – Data Display Debugger (2010)
6. A. Ko, in Proceedings of the 28th International Conference on Software Engineering – ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 – Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers. 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the Future of Software Engineering – FOSE 2014 (ACM Press, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman. How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen. Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Röhm. Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng. IEEE Software 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar. Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-Oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz. Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel. Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick. Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn. Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming. IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter. IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers. Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications – OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker. Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices. Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BigComp) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder. IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung. IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension – ICPC 2014 (ACM Press, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering – FSE 2014 (ACM Press, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant. International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson. Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ACM Press, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, in Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken. Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 – Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa. Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings – International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer. Proceedings – IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan. ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig. 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define thesemantics of the SDT messages

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig. 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.

– Event is the event data collected when a developer performs some action during a debugging session.

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
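
For illustration only, such a response could look like the sketch below; the exact payload and field names depend on the SDS domain model and on Spring Data REST conventions, so they are assumptions here, not the documented API:

{
  "_embedded": {
    "developers": [
      { "name": "petrillo" }
    ]
  }
}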

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig. 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig. 16 Neo4J Browser – a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
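
As a sketch of what such a query might look like, the following Cypher statement lists the most frequent method-to-method invocations; the node label Method and the relationship type INVOKES are illustrative assumptions, not necessarily the exact schema used by the SDS:

MATCH (m1:Method)-[inv:INVOKES]->(m2:Method)
RETURN m1.name, m2.name, count(inv) AS calls
ORDER BY calls DESC;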

Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack-trace items are analyzed by the Tracer, extracting method invocations.
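
A minimal sketch of how such a tracer hooks into the Eclipse debug framework is shown below. The two listener interfaces are the ones named above and are part of the standard Eclipse debug API; the class body and comments describe assumed behaviour rather than the exact SDT implementation:

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    // Register for debugger and breakpoint notifications.
    public void start() {
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Stepping events (Step Into/Over/Return) surface as SUSPEND events.
            if (event.getKind() == DebugEvent.SUSPEND) {
                // Here the tracer would inspect the suspended thread's stack
                // frames and send the extracted invocations to the SDS via REST.
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // Here the tracer would capture the breakpoint's type and location
        // and persist them through the Swarm RESTful API.
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}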

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig. 17 The Swarm Tracer architecture [17]

Fig. 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each invoking/invoked method pair. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

invocations entry for each pair invokinginvoked method Following the forag-ing approach [57] the SDT only collects invokinginvoked methods that werevisited by the developer during the debugging session ignoring other invoca-tions The debugging activity continues until the program run finishes TheSwarm session is then completed

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] that represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


Fig. 20 Sequence stack diagram for the Bridge design pattern

Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
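
Under the hood, such a search can be expressed with ElasticSearch's fuzzy query DSL, roughly as in the sketch below; the index name and field name are assumptions for illustration, not the SDS's actual mapping:

POST /breakpoints/_search
{
  "query": {
    "fuzzy": {
      "typeName": { "value": "fcatory" }
    }
  }
}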


Fig. 21 Method call graph for the Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes invoking methods and β is the subset of all vertexes invoked by methods, then the starting and ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
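
A minimal sketch of this computation, assuming invocations are available as invoking/invoked method-name pairs (the class and method names below are illustrative, not the SDS implementation):

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class EntryExitFinder {

    // Each edge is an invoking -> invoked method pair.
    public record Edge(String invoking, String invoked) {}

    // Starting points: methods in α (invoking) but not in β (invoked).
    public static Set<String> startingPoints(List<Edge> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (Edge e : edges) {
            invoking.add(e.invoking());
            invoked.add(e.invoked());
        }
        Set<String> result = new HashSet<>(invoking);
        result.removeAll(invoked);
        return result;
    }

    // Ending points: methods in β (invoked) but not in α (invoking).
    public static Set<String> endingPoints(List<Edge> edges) {
        Set<String> invoking = new HashSet<>();
        Set<String> invoked = new HashSet<>();
        for (Edge e : edges) {
            invoking.add(e.invoking());
            invoked.add(e.invoked());
        }
        Set<String> result = new HashSet<>(invoked);
        result.removeAll(invoking);
        return result;
    }
}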

Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developer explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 10: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

10Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 3 GV elements - Types (nodes) invocations (edge) and Task filter area

Debugging Views All the implementation details of SDI are available in theAppendix section

41 Swarm Debugging Global View

Swarm Debugging Global View (GV) is a call graph for modeling softwarebased on directed call graph [31] to explicit the hierarchical relationship byinvocated methods This visualization use rounded gray boxes (Figure 3-A) torepresent types or classes (nodes) and oriented arrows (Figure 3-B) to expressinvocations (edges) GV is built using previous debugging session context datacollected by developers for different tasks

GV was implemented using CytoscapeJS [32] a Graph API JavaScriptframework applying an automatic layout manager breadthfirst As a web appli-cation the SD visualisations can be integrated into an Eclipse view as an SWTBrowser Widget or accessed through a traditional browser such as MozillaFirefox or Google Chrome

In this view the grey boxes are types that developers visited during debug-ging sessions The edges represent method calls (Step Into or F5 on Eclipse)performed by all developers in all traced tasks on a software project Eachedge colour represents a task and line thickness is proportional to the numberof invocations Each debugging session contributes with a context generat-ing the visualisation combining all collected invocations The visualisation isorganised in layers or stacks and each line is a layer of invocations The start-ing points (non-invoked methods) are allocated on top of a tree the adjacent

Swarm Debugging the Collective Intelligence on Interactive Debugging 11

Fig 4 GV on all tasks

nodes in an invocation sequence Besides developers can directly go to a typein the Eclipse Editor by double-clicking over a node in the diagram In the leftcorner developers can use radio buttons to filter invocations by task (figure 3-C) showing the paths used by developers during previous debugging sessionsby a task Finally developers can use the mouse to pan and zoom inout onthe visualisation Figure 4 shows an example of GV with all tasks for JabRefsystem and we have data about 8 tasks

GV is a contextual visualization that shows only the paths explicitlyand intentionally visited by developers including type declarations andmethod invocations explored by developers based on their decisions

5 Using SDI to Understand Debugging Activities

The first benefit of the SDI is that it allows collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviors during debugging activities. To illustrate this point, we conducted two experiments using the SDI to understand developers' debugging habits: the times and efforts with which they set breakpoints and the locations where they set them.

Our analysis builds upon three independent sets of observations involving, in total, three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions available from our own collected videos (Study 1) and from an empirical study performed by Jiang et al. [33] (Study 2).

In these studies, we answered the following research questions:

RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In the following subsections, we elaborate on each of the two studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef12 version 3.2 as subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion and low coupling, thus reducing the potential commingle effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service13. Two participants were experts and three were intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. We also recruited 12 undergraduate and graduate students at Polytechnique Montreal to participate in our study. We surveyed all the participants' background information before the study14. The survey included questions about the participants' self-assessed level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD and interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the Eclipse debugger.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose fault-fixing tasks that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available on https://goo.gl/forms/dxCQaBke2l2cqjB42


Table 1 Summary of the issues considered in JabRef in Study 1

Issues   Summaries
318      "Normalize to Bibtex name format"
667      "hash/pound sign causes URL link to fail"
669      "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993      "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026     "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating step-by-step how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to collect breakpoint-related events automatically. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), an open-source software for live streaming and recording. We used OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 http://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that each task would require about 20 minutes per fault, which we discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
  – call: method/function invocations
  – return: returns of values
  – assignment: settings of values
  – if-statement: conditional statements
  – while-loop: loop iterations

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started counting when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.

– End Time (T): the time when the participant finished a task.


– Elapsed End Time (ET): ET = T − ST
– Elapsed Time to First Breakpoint (EF): EF = FB − ST
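Assuming the timestamps are recorded as instants, these metrics are direct subtractions; a minimal sketch:

    import java.time.Duration;
    import java.time.Instant;

    class SessionMetrics {
        // ET = T - ST: total task duration.
        static Duration elapsedEndTime(Instant st, Instant t) {
            return Duration.between(st, t);
        }

        // EF = FB - ST: time from task start to the first breakpoint.
        static Duration elapsedFirstBreakpoint(Instant st, Instant fb) {
            return Duration.between(st, fb);
        }
    }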

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks had been solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 took different approaches. The tasks in Study 1 were fault-location tasks (developers did not correct the faults), while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 analysed only one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET    (1)
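For instance, with illustrative numbers (not data from our studies), a participant who sets her first breakpoint 5 minutes into a 20-minute task obtains

\[
\mathit{MFB} = \frac{EF}{ET} = \frac{5\ \mathrm{min}}{20\ \mathrm{min}} = 0.25,
\]

that is, one quarter of the session elapsed before the first breakpoint.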

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Tasks    Average Times (min)   Std Devs (min)
318      44                    64
667      28                    29
669      22                    25
993      25                    25
1026     25                    17
PdfSam   54                    18
Raptor   59                    13

Table 2 shows the average effort (in minutes) for each task. We found in Study 1 that, on average, participants spent 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, participants took, on average, 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort of setting the first breakpoint takes nearly one quarter of the total effort of a single debugging sessiona. So, this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

a In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run the application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalised the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and the time of the first breakpoint (ρ = −0.47): task elapsed time is inversely correlated to the time of the task's first breakpoint. The fitted curve follows Equation 2:

f(x) = α / x^β    (2)

where α = 1.2 and β = 0.44.
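For illustration, if x is the task elapsed time in minutes and f(x) approximates the normalised time of the first breakpoint, evaluating the fitted curve at 25 minutes gives

\[
f(25) = \frac{1.2}{25^{0.44}} \approx \frac{1.2}{4.12} \approx 0.29,
\]

which is consistent with the near one-quarter ratio observed for RQ1.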

Fig 5 Relation between the time of the first breakpoint and the task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints were set on call statements, while only 1% (3/207) were set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints were set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements: in Study 1, we found 17%, while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints were on control-flow statements, while 53% were on call statements and 17% on assignments.
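As an aside for replication, such a classification can be approximated with simple syntactic rules. The following heuristic is an illustrative sketch of ours, not the instrument used in the study:

    // Illustrative heuristic for classifying the statement under a breakpoint.
    class StatementClassifier {
        static String classify(String sourceLine) {
            String s = sourceLine.trim();
            if (s.startsWith("if (") || s.startsWith("if(")) return "if-statement";
            if (s.startsWith("while")) return "while-loop";
            if (s.startsWith("return")) return "return";
            // An '=' not part of '==' suggests an assignment (checked before
            // calls, since the right-hand side often contains a call).
            if (s.matches("[^=]*[A-Za-z_$][\\w$\\[\\]\\.]*\\s*=[^=].*")) return "assignment";
            if (s.matches(".*\\w+\\s*\\(.*\\).*")) return "call";
            return "other";
        }
    }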

Table 3 Study 1 - Breakpoints per type of statement

Statements     Numbers of Breakpoints   %
call           111                      53
if-statement    39                      19
assignment      36                      17
return          18                      10
while-loop       3                       1

Table 4 Study 2 - Breakpoints per type of statement

Statements     Numbers of Breakpoints   %
call           43                       43
if-statement   22                       22
assignment     27                       27
return          4                        4
while-loop      4                        4


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, the while-loop statement being the least common (1-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same lines of code for different participants performing the same task, i.e., resolving the same fault, and across different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on:

    JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale in those decisions because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks   Classes              Lines of Code   Breakpoints
0318    AuthorsFormatter     43              5
0318    AuthorsFormatter     131             3
0667    BasePanel            935             2
0667    BasePanel            969             3
0667    JabRefDesktop        430             2
0669    OpenDatabaseAction   268             2
0669    OpenDatabaseAction   433             4
0669    OpenDatabaseAction   451             4
0993    EntryEditor          717             2
0993    EntryEditor          720             2
0993    EntryEditor          723             2
0993    BibDatabase          187             2
0993    BibDatabase          456             2
1026    EntryEditor          1184            2
1026    BibtexParser         160             2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                 Lines of Code   Breakpoints
PdfReader               230             2
PdfReader               806             2
PdfReader               1923            2
ConsoleServicesFacade   89              2
ConsoleClient           81              2
PdfUtility              94              2
PdfUtility              96              2
PdfUtility              102             2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes             Lines of Code   Breakpoints
icsUtils            333             3
Game                1751            2
ExamineController   41              2
ExamineController   84              3
ExamineController   87              2
ExamineController   92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same classes received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on each class for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the number of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                     Lines of Code          Breakpoints
BibtexParser                138, 151, 159          2, 2, 2
                            160, 165, 168          3, 2, 3
                            176, 198, 199, 299     2, 2, 2, 2
EntryEditor                 717, 720, 721          3, 4, 2
                            723, 837, 842          2, 3, 2
                            1184, 1393             3, 2
BibDatabase                 175, 187, 223, 456     2, 3, 2, 6
OpenDatabaseAction          433, 450, 451          4, 2, 4
JabRefDesktop               40, 84, 430            2, 2, 3
SaveDatabaseAction          177, 188               4, 2
BasePanel                   935, 969               2, 5
AuthorsFormatter            43, 131                5, 4
EntryTableTransferHandler   346                    2
FieldTextMenu               84                     2
JabRefFrame                 1119                   2
JabRefMain                  8                      5
URLUtil                     95                     2

Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same methods across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and that 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints by different developers on different tasks.


Table 9 Study 1 - Breakpoints by class across different tasks

Types                Tasks with breakpoints   Breakpoints   Dev Diversities
SaveDatabaseAction   3 of 5                   7             2
BasePanel            4 of 5                   14            7
JabRefDesktop        2 of 5                   9             4
EntryEditor          3 of 5                   36            4
BibtexParser         3 of 5                   44            6
OpenDatabaseAction   3 of 5                   19            13
JabRef               3 of 5                   3             3
JabRefMain           4 of 5                   5             4
URLUtil              2 of 5                   4             2
BibDatabase          3 of 5                   19            4


Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess further benefits that our approach can bring to developers, we conducted a controlled experiment and interviews focused on analysing the debugging behavior of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wished to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks, using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is a reference-management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21: 23 male and seven female. Our participants had, on average, six years of experience in software development (std. dev. four years). They had, on average, 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% were advanced or expert in Java.

Among these professionals, 23 participated in the qualitative evaluation of GV, and 13 participated in the fault-location controlled experiment (7 in the control group and 6 in the experimental group) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault?" for Task 318 or "For Task 1173, where would you toggle a breakpoint to fix the fault?", as well as questions about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the participants' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all the tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environments were correctly configured and that the participants understood the instructions. The warm-up task was described in a video to guide the participants, which we made available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV in a regular Web browser and without access to the JabRef source code. We asked the participants to identify the "type" (class) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants).

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experimenttutorial.html
24 https://youtu.be/U1sBMpfL2jc


Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25, with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted of starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the locations of the faults described in the issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired and to inform us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 (OBS) for video recording.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type containing the location of each fault, without reading the task description or looking at the code. The GV-generated graph showed invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found - 27/31 sessions) and "bad" sessions (the fault was not found - 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com



For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested URLUtil as the problematic type.

Fig 9 GV for Task 0667

Swarm Debugging the Collective Intelligence on Interactive Debugging 27

Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by looking at the GV. However, none of them (0/23) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining the stepping paths of several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without previously seeing the code.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during the fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video-recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on a selected node. We also asked participants whether GV is useful to support software-maintenance tasks. In our qualitative study, 87% of participants agreed that GV is useful or very useful (100% at least useful) (Figure 11), and, in the task survey after the fault-location tasks, 75% of participants claimed that GV is useful or very useful (100% at least useful) (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig 11 GV usefulness - experimental phase one

Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful for supporting software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


Table 10 Results from control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   % [E/C]
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   % [E/C]
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time setting the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session were nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) compared to the control group. However, participants who used our approach spent less time finishing both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results for RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation, and we now discuss both its intrinsic and accidental advantages and limitations as reported by them. We return to some of these limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example, a null-pointer exception, and the location of the source of the fault, for example, a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known to not scale well, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information in the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all the developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to the other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information in our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes and methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently, in different parts of the graph, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and to perform more experiments to find further ways of improving our approach.



7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, developers of debuggers, and educators. The SDI (and GV) is open and freely available on-line28, and researchers can use it to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns and identify the debugging strategies that are most efficient in the context of their projects, improving their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entities. Thus, breakpoints previously toggled by one developer could assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new session by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent toggling a new breakpoint.
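As a hypothetical sketch of such a query (the endpoint, the typeName parameter, and the response format are our assumptions for illustration; BibtexParser is just an example type):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    class BreakpointSearch {
        public static void main(String[] args) throws Exception {
            HttpClient http = HttpClient.newHttpClient();
            // Hypothetical query: breakpoints previously shared for a type,
            // usable as candidate starting points for a new debugging session.
            HttpRequest query = HttpRequest.newBuilder()
                .uri(URI.create("http://server.swarmdebugging.org"
                        + "/breakpoints?typeName=BibtexParser"))
                .GET()
                .build();
            HttpResponse<String> response =
                http.send(query, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body()); // assumed: JSON list of breakpoints
        }
    }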

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools - using novel data-mining techniques - that integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed over time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered the question of what debugging information is useful to share among developers to ease debugging, providing evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on how developers set breakpoints and step through code, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results there exist threats to the validity of our studythat we discuss in this section

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the questions that developers asked in another study, by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens, and we compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc., through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data are. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data are sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware that all the faults had already been fixed on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the chosen system can affect our results.

9 Related work

We now summarise the work related to debugging to position our study among published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program-comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature- and fault-location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], and version history and defect-report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruptions [42], editing patterns [43,44], program exploration patterns [45], and copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it filters files in an Eclipse view following a previously built context, whereas our approach serves both editing mode (finding breakpoints or visualizing paths) and interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that the empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], discussed by Hofer et al., exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to systematically minimise a failure-inducing input. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change-impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change-impact analysis during debugging sessions. They found that the programmers in their studies did static change-impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change-impact analysis after they made changes, by running the programs. In their studies, programmers did not use any change-impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human debugging effort.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and of sharing debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints.


Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code with a higher incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining the stepping paths of several debugging sessions in a graph visualisation gives developers elements to support hypotheses about fault locations without previously looking at the code, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agency CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson. Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs. Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright. GNU DDD - Data Display Debugger (2010)
6. A. Ko. Proceeding of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers. 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman. How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen. Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Röhm. Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng. Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar. Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz. Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel. Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick. Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn. Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming. IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter. IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers. Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker. Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212
33. S. Jiang, C. McMillan, R. Santelices. Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder. IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung. IEEE Transaction on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant. International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson. Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso. Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken. Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa. Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer. Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan. ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).
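For illustration, a tracer client could report a debugging event to the SDS with a plain HTTP call. This is only a sketch: the endpoint and payload fields shown here (breakpoints, session, typeName, lineNumber) are assumptions for illustration, not the documented SDS API.

curl -X POST http://swarmdebugging.org/breakpoints \
     -H "Content-Type: application/json" \
     -d '{"session": 42, "typeName": "BasePanel", "lineNumber": 969}'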

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the data collected when a developer performs some action during a debugging session.

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with the list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
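For illustration only, such a response could look like the following JSON; the exact structure is an assumption based on Spring Data REST defaults (HAL style), not the actual SDS payload:

{
  "_embedded": {
    "developers": [
      { "id": 1, "name": "petrillo" }
    ]
  }
}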

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
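For example, a researcher could rank types by the number of breakpoints they received with a query in the following spirit; the table and column names (breakpoint, type) are assumptions about the SDS schema, used only for illustration:

SELECT t.name AS type_name, COUNT(b.id) AS breakpoints
FROM breakpoint b
JOIN type t ON b.type_id = t.id
GROUP BY t.name
ORDER BY breakpoints DESC;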

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
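A query in the spirit of Figure 16 could retrieve which methods were reached from a given type during the collected sessions. This is a sketch: the node label and relationship type (Method, INVOKES) are assumptions about the SDS graph model, not its documented schema.

MATCH (caller:Method)-[:INVOKES]->(callee:Method)
WHERE caller.typeName = 'BasePanel'
RETURN caller.name, callee.name;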

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack-trace items are analyzed by the Tracer, extracting method invocations.
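The following Java sketch illustrates how such a tracer can hook into the Eclipse debug infrastructure. It is a minimal approximation, not the actual SDT source; the forwarding method sendToSwarmServices() is a hypothetical placeholder for the RESTful reporting described below.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

  // Register this tracer with the Eclipse debug plug-in.
  public void start() {
    DebugPlugin.getDefault().addDebugEventListener(this);
    DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
  }

  // Called for debug events, e.g., suspensions caused by stepping or breakpoints.
  public void handleDebugEvents(DebugEvent[] events) {
    for (DebugEvent event : events) {
      if (event.getKind() == DebugEvent.SUSPEND) {
        sendToSwarmServices(event); // hypothetical forwarding to the SDS
      }
    }
  }

  // Called when a developer toggles a breakpoint in the IDE.
  public void breakpointAdded(IBreakpoint breakpoint) {
    sendToSwarmServices(breakpoint); // hypothetical
  }

  public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

  public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }

  private void sendToSwarmServices(Object payload) {
    // In the real SDT, specialised DomainServices send RESTful messages
    // to a SwarmRestFacade; omitted in this sketch.
  }
}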

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each invoking/invoked method pair. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (a non-invoked method) and the last node is an ending method (a non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw() in Figure 20. In addition, developers can directly go to a method in the Eclipse editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invoked methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
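In ElasticSearch, such a fuzzy search can be expressed with a fuzzy query, which matches terms within a bounded edit distance (which is why the misspelled fcatory can still match factory-related breakpoints). In this sketch, the index and field names (breakpoints, typeName) are assumptions for illustration:

POST /breakpoints/_search
{
  "query": {
    "fuzzy": { "typeName": "fcatory" }
  }
}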


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V_1, V_2, ..., V_n} and E is a set of edges E = {(V_1, V_2), (V_1, V_3), ...}, each edge is a pair <V_i, V_j>, where V_i is the invoking method and V_j is the invoked method. If α is the subset of all vertexes invoking methods and β is the subset of all vertexes invoked by methods, then the starting and ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
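These definitions translate directly into code over the collected invocation edges: collect the invoking and invoked sets and keep the asymmetric differences. The following minimal Java sketch (not the SDI implementation; method identifiers are plain strings here, and each edge pair {caller, callee} is an assumed input format) computes both sets:

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class EntryExitFinder {

  // Each invocation edge is a pair: edge[0] invokes edge[1].
  public static Set<String> startingPoints(List<String[]> invocations) {
    Set<String> invoking = new HashSet<>();
    Set<String> invoked = new HashSet<>();
    for (String[] edge : invocations) {
      invoking.add(edge[0]);
      invoked.add(edge[1]);
    }
    Set<String> starting = new HashSet<>(invoking);
    starting.removeAll(invoked); // in alpha, not in beta
    return starting;
  }

  public static Set<String> endingPoints(List<String[]> invocations) {
    Set<String> invoking = new HashSet<>();
    Set<String> invoked = new HashSet<>();
    for (String[] edge : invocations) {
      invoking.add(edge[0]);
      invoked.add(edge[1]);
    }
    Set<String> ending = new HashSet<>(invoked);
    ending.removeAll(invoking); // in beta, not in alpha
    return ending;
  }
}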

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


Fig 4 GV on all tasks

nodes in an invocation sequence. Besides, developers can directly go to a type in the Eclipse editor by double-clicking on a node in the diagram. In the left corner, developers can use radio buttons to filter invocations by task (Figure 3-C), showing the paths used by developers during previous debugging sessions for a task. Finally, developers can use the mouse to pan and to zoom in/out on the visualisation. Figure 4 shows an example of the GV with all tasks for the JabRef system, for which we have data about 8 tasks.

The GV is a contextual visualization that shows only the paths explicitly and intentionally visited by developers, including type declarations and method invocations explored by developers, based on their decisions.

5 Using SDI to Understand Debugging Activities

The first benefit of the SDI is the fact that it allows collecting detailed information about debugging sessions. Using this information, researchers can investigate developers' behaviors during debugging activities. To illustrate this point, we conducted two experiments using the SDI to understand developers' debugging habits: the times and effort with which they set breakpoints and the locations where they set breakpoints.

Our analysis builds upon three independent sets of observations involving, in total, three systems. Studies 1 and 2 involved JabRef, PDFSaM, and Raptor as subject systems. We analysed 45 video-recorded debugging sessions available from our own collected videos (Study 1) and from an empirical study performed by Jiang et al. [33] (Study 2).

In this study, we answered the following research questions:

RQ1 Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

RQ2 What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?


RQ3 Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

RQ4 Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef12 version 3.2 as the subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion, low coupling, thus reducing the potential commingling effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service13. Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montreal to participate in our study. We surveyed all the participants' background information before the study14. The survey included questions about participants' self-assessment of their level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD, interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the debugger of Eclipse.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose the task of fixing faults that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available on https://goo.gl/forms/dxCQaBke2l2cqjB42


Table 1 Summary of the issues considered in JabRef in Study 1

Issues   Summaries
318      "Normalize to Bibtex name format"
667      "hash/pound sign causes URL link to fail"
669      "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993      "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026     "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating, step by step, how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17], to collect breakpoint-related events automatically. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), open-source software for live streaming and recording. We used the OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 http://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants who participated in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that each task would require about 20 minutes per fault, which we discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
  – call: method/function invocations;
  – return: returns of values;
  – assignment: settings of values;
  – if-statement: conditional statements;
  – while-loop: loop iterations.

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started counting when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.
– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.
– End Time (T): the time when the participant finished a task.
– Elapsed End Time (ET): ET = T - ST.
– Elapsed Time to First Breakpoint (EF): EF = FB - ST.

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks were solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault-location tasks (developers did not correct the faults), while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1 Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET    (1)
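As an illustrative computation: a participant who sets her first breakpoint 5 minutes into a 20-minute session obtains MFB = 5/20 = 0.25; values close to 0 mean the first breakpoint was set almost immediately, while values close to 1 mean it was set near the end of the task.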

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Tasks    Average Times (min)   Std Devs (min)
318      44                    64
667      28                    29
669      22                    25
993      25                    25
1026     25                    17
PdfSam   54                    18
Raptor   59                    13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, it took participants, on average, 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session.a So, this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

a In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need to conclude one debugging task in one-to-many interactive debugging sessions.


RQ2 What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = -0.47), showing that task elapsed time is inversely correlated with the time of a task's first breakpoint.

The fitted curve in Figure 5 is given by Equation 2:

f(x) = α / x^β    (2)

where α = 12 and β = 0.44.

Fig 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3 Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where in Study 1 we found 17%, while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statements     Numbers of Breakpoints   %
call           111                      53%
if-statement   39                       19%
assignment     36                       17%
return         18                       10%
while-loop     3                        1%

Table 4 Study 2 - Breakpoints per type of statement

Statements     Numbers of Breakpoints   %
call           43                       43%
if-statement   22                       22%
assignment     27                       27%
return         4                        4%
while-loop     4                        4%


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, the while-loop statement being the least common (2-4%).


RQ4 Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on:

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Röhm [15]. We suggest that there is an underlying rationale in that decision because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks   Classes              Lines of Code   Breakpoints
0318    AuthorsFormatter     43              5
0318    AuthorsFormatter     131             3
0667    BasePanel            935             2
0667    BasePanel            969             3
0667    JabRefDesktop        430             2
0669    OpenDatabaseAction   268             2
0669    OpenDatabaseAction   433             4
0669    OpenDatabaseAction   451             4
0993    EntryEditor          717             2
0993    EntryEditor          720             2
0993    EntryEditor          723             2
0993    BibDatabase          187             2
0993    BibDatabase          456             2
1026    EntryEditor          1184            2
1026    BibtexParser         160             2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                 Lines of Code   Breakpoints
PdfReader               230             2
PdfReader               806             2
PdfReader               1923            2
ConsoleServicesFacade   89              2
ConsoleClient           81              2
PdfUtility              94              2
PdfUtility              96              2
PdfUtility              102             2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes             Lines of Code   Breakpoints
icsUtils            333             3
Game                1751            2
ExamineController   41              2
ExamineController   84              3
ExamineController   87              2
ExamineController   92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, resulting in 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                     Lines of Code        Breakpoints
BibtexParser                138, 151, 159        2, 2, 2
                            160, 165, 168        3, 2, 3
                            176, 198, 199, 299   2, 2, 2, 2
EntryEditor                 717, 720, 721        3, 4, 2
                            723, 837, 842        2, 3, 2
                            1184, 1393           3, 2
BibDatabase                 175, 187, 223, 456   2, 3, 2, 6
OpenDatabaseAction          433, 450, 451        4, 2, 4
JabRefDesktop               40, 84, 430          2, 2, 3
SaveDatabaseAction          177, 188             4, 2
BasePanel                   935, 969             2, 5
AuthorsFormatter            43, 131              5, 4
EntryTableTransferHandler   346                  2
FieldTextMenu               84                   2
JabRefFrame                 1119                 2
JabRefMain                  8                    5
URLUtil                     95                   2

Fig 6 Methods with 5 or more breakpoints

Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntityEditor.storeSource() received 24 breakpoints and the method BibtexParser.parseFileContent() received 20 breakpoints, by different developers on different tasks.

Table 9 Study 1 - Breakpoints by class across different tasks

Types                Tasks with breakpoints   Breakpoints   Dev Diversities
SaveDatabaseAction   3 of 5                   7             2
BasePanel            4 of 5                   14            7
JabRefDesktop        2 of 5                   9             4
EntryEditor          3 of 5                   36            4
BibtexParser         3 of 5                   44            6
OpenDatabaseAction   3 of 5                   19            13
JabRef               3 of 5                   3             3
JabRefMain           4 of 5                   5             4
URLUtil              2 of 5                   4             2
BibDatabase          3 of 5                   19            4

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5 Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6 Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using the GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program, using the GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is reference-management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21: 23 male and seven female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have, on average, 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or experts in Java.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of the GV) and 11 participated in fault location (controlled experiment: 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked five faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault?" for Task 318 or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of the GV.22

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all the tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environments were correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with the GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using the GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25, with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of the GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5 Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by the GV to identify the type containing the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found: 27/31 sessions) and "bad" sessions (the fault was not found: 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 699, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig 9 GV for Task 0667


Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type containing the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations, without previously seeing the code.

RQ6 Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used the GV to support their tasks (video-recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants whether the GV is useful to support software-maintenance tasks. We report that 87% of participants agreed that the GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that the GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig 11 GV usefulness - experimental phase one

Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that the GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and consequently reduces the effort of searching code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   Δ [C-E] (s)   % [E/C]
First breakpoint   00:02:55      00:03:40         -44           126%
Time to start      00:04:44      00:05:18         -33           112%
Elapsed time       00:30:08      00:16:05         843           53%

Task 1026
Metric             Control [C]   Experiment [E]   Δ [C-E] (s)   % [E/C]
First breakpoint   00:02:42      00:04:48         -126          177%
Time to start      00:04:02      00:03:43         19            92%
Elapsed time       00:24:58      00:20:41         257           83%

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we see that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in toggling the first breakpoint carefully but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We return to some of these limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example, a null-pointer exception, and the location of the source of the fault, for example, a constructor where a field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known to be not scalable, so we are expecting issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed" while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show in different parts of the graph the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point out further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for Software Engineering researchers, developers, debuggers' developers, and educators. SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits, to create new tools – using novel data-mining techniques – and to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (line of code, class, or method) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant between the first and last task. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging, discussed by Hofer et al. [53], can be used to minimise a failure-inducing input systematically: the smaller the failure-inducing input, the less program code is covered. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change-impact analysis process that should be done during software maintenance by developers to make sure changes do not introduce new faults. They conducted two studies about change-impact analysis during debugging sessions. They found that the programmers in their studies did static change-impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change-impact analysis after they made changes, by running the programs. In their study, programmers did not use any change-impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without looking at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularity of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agency CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson. Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs. Debugging with GDB – The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright. GNU DDD – Data Display Debugger (2010)
6. A. Ko. Proceedings of the 28th International Conference on Software Engineering – ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 – Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers. 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the Future of Software Engineering – FOSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman. How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen. Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Röhm. Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng. Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar. Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz. Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel. Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick. Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn. Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming. IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter. IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers. Proceedings of the 12th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications – OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker. Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices. Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder. IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung. IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension – ICPC 2014 (ACM Press, New York, NY, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering – FSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant. International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson. Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ESEC/FSE) (ACM Press, New York, NY, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso. Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken. Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 – Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, NY, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa. Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings – International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer. Proceedings – IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan. ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig. 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer: an SDT user. She creates and executes debugging sessions.
– Product: the target software product. A product is a set of Eclipse projects (1 or more).
– Task: the task to be executed by developers.
– Session: represents a Swarm Debugging session. It relates developer, project, and debugging events.
– Type: represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method: a method associated with a type, which can be invoked during debugging sessions.
– Namespace: a container for types. In Java, namespaces are declared with the keyword package.
– Invocation: a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint: represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate (see the sketch after this list).
– Event: event data collected when a developer performs some actions during a debugging session.

Fig. 14 The Swarm Debugging metadata [17]
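To make the meta-model concrete, the following is a minimal sketch of how two of these concepts could be mapped to Java classes. The class and field names are illustrative assumptions on our part, not the actual SDI schema.

import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical mapping of two meta-model concepts to plain Java classes.
class Breakpoint {
    String type;       // fully qualified type where the breakpoint was toggled
    String method;     // enclosing method, if any
    int lineNumber;    // source line of the breakpoint
    Instant createdAt; // when the developer toggled it
}

class Session {
    String developer;  // the Developer who runs the session
    String product;    // the Product (set of Eclipse projects) being debugged
    String task;       // the Task being performed
    List<Breakpoint> breakpoints = new ArrayList<>(); // collected breakpoints
}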

The SDS provides several services for manipulating, querying, and searching the collected data: (1) the Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built with the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure.

29 http://projects.spring.io/spring-boot


For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
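The /search/findByName?name=... URL shape matches the convention that Spring Data REST uses to expose derived query methods, so a repository along the following lines could serve such a request. This is our illustration of how the endpoint could be declared, not necessarily the actual SDI source; it assumes a Developer entity exists.

import java.util.List;
import org.springframework.data.repository.CrudRepository;
import org.springframework.data.repository.query.Param;
import org.springframework.data.rest.core.annotation.RepositoryRestResource;

// Spring Data REST derives the query from the method name and exposes it
// at /developers/search/findByName?name=... automatically.
@RepositoryRestResource(collectionResourceRel = "developers", path = "developers")
public interface DeveloperRepository extends CrudRepository<Developer, Long> {
    List<Developer> findByName(@Param("name") String name);
}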

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig. 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig. 16 Neo4J Browser – a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
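For illustration, a query in the spirit of Figure 16 could also be issued from Java through the Neo4j driver, as in the sketch below. The node labels, relationship type, and properties used here (Session, Method, INVOKED, task) are our assumptions about the graph model, not names documented by the SDI.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;
import org.neo4j.driver.Values;

public class InvocationQuery {
    public static void main(String[] args) {
        // Connection settings are placeholders for a local Neo4j instance.
        Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "secret"));
        try (Session session = driver.session()) {
            // List the methods invoked during sessions on a given task.
            Result result = session.run(
                    "MATCH (s:Session)-[:INVOKED]->(m:Method) "
                    + "WHERE s.task = $task RETURN m.name AS name",
                    Values.parameters("task", "0993"));
            while (result.hasNext()) {
                Record record = result.next();
                System.out.println(record.get("name").asString());
            }
        }
        driver.close();
    }
}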

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
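A minimal sketch of how such a tracer can hook into the Eclipse debug framework is shown below. The event handling (reacting to suspend events and breakpoint additions) is our simplification of what the SDT presumably does; the actual plug-in collects and posts richer data.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

// Simplified tracer: listens to debug events and breakpoint changes.
public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);
        plugin.getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // A SUSPEND event (caused by a breakpoint or a step) marks a
            // stopping point whose stack trace can be inspected and sent
            // to the Swarm Debugging Services.
            if (event.getKind() == DebugEvent.SUSPEND) {
                // ... extract and post invocation data (omitted)
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // ... post the new breakpoint's type/line to the SDS (omitted)
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}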

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig. 17 The Swarm Tracer architecture [17]

Fig. 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig. 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse editor by double-clicking a node in the diagram.

Dynamic Method Call Graphs. They are direct call graphs [31], as shown in Figure 21, to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated at the top of a tree, and adjacent nodes represent invocation sequences.


Fig. 20 Sequence stack diagram for the Bridge design pattern

Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
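For illustration, a fuzzy query such as the one behind Figure 19 could be built with the ElasticSearch Java client as sketched below. The index and field names (breakpoints, typeName) are our assumptions, not documented SDI names.

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class FuzzyBreakpointSearch {
    public static void main(String[] args) {
        // Fuzzy matching tolerates typos: "fcatory" still matches
        // breakpoints toggled in types whose name contains "factory".
        SearchSourceBuilder source = new SearchSourceBuilder()
                .query(QueryBuilders.fuzzyQuery("typeName", "fcatory"));
        SearchRequest request = new SearchRequest("breakpoints").source(source);
        // Printing the source shows the JSON query that would be sent.
        System.out.println(source.toString());
    }
}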


Fig. 21 Method call graph for the Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked by methods, then the starting and ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developer explorations. Moreover, dividing software exploration by sessions makes its call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


RQ3: Are there consistent, common trends with respect to the types of statements on which developers set breakpoints?

RQ4: Are there consistent, common trends with respect to the lines, methods, or classes on which developers set breakpoints?

In this section, we elaborate more on each of the studies.

5.1 Study 1: Observational Study on JabRef

5.1.1 Subject System

To conduct this first study, we selected JabRef12, version 3.2, as the subject system. This choice was motivated by the fact that JabRef's domain is easy to understand, thus reducing any learning effect. It is composed of relatively independent packages and classes, i.e., high cohesion and low coupling, thus reducing the potential commingling effect of low code quality.

5.1.2 Participants

We recruited eight male professional developers via an Internet-based freelancer service13. Two participants are experts and three are intermediate in Java. Developers self-reported their expertise levels, which thus should be taken with caution. Also, we recruited 12 undergraduate and graduate students at Polytechnique Montréal to participate in our study. We surveyed all the participants' background information before the study14. The survey included questions about the participants' self-assessment of their level of programming expertise (Java, IDE, and Eclipse), gender, first natural language, schooling level, and knowledge about TDD, interactive debugging, and why they usually use a debugger. All participants stated that they had experience in Java and worked regularly with the debugger of Eclipse.

5.1.3 Task Description

We selected five defects reported in the issue-tracking system of JabRef. We chose the tasks of fixing faults that would potentially require developers to set breakpoints in different Java classes. To ensure this, we manually conducted the debugging ourselves and verified that, to understand the root cause of the faults, we had to set at least two breakpoints during our interactive debugging sessions. Then, we asked participants to find the locations of the faults described in Issues 318, 667, 669, 993, and 1026. Table 1 summarises the faults, using their titles from the issue-tracking system.

12 http://www.jabref.org
13 https://www.freelancer.com
14 Survey available at https://goo.gl/forms/dxCQaBke2l2cqjB42


Table 1 Summary of the issues considered in JabRef in Study 1

Issues  Summaries
318     "Normalize to Bibtex name format"
667     "hash/pound sign causes URL link to fail"
669     "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993     "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026    "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating, step by step, how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to collect breakpoint-related events automatically. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), open-source software for live streaming and recording. We used the OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and also to accustom the participants to the study settings. It was a trivial task that we also used to filter out participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 http://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants who took part in our study correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that each task would require about 20 minutes per fault, which we will discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.
– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
  – call: method/function invocations;
  – return: returns of values;
  – assignment: settings of values;
  – if-statement: conditional statements;
  – while-loop: loops, iterations.
– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started to count when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.
– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.
– End Time (T): the time when the participant finished a task.
– Elapsed End Time (ET): ET = T − ST.
– Elapsed Time to First Breakpoint (EF): EF = FB − ST.

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks were solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault-location tasks (developers did not correct the faults), while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET    (1)
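For instance, under this definition, a participant who sets her first breakpoint 5 minutes into a task that takes 20 minutes overall obtains MFB = 5/20 = 0.25; these numbers are illustrative, not a data point from our studies.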

Table 2 Elapsed time by task (average) – Study 1 (JabRef) and Study 2

Tasks   Average Times (min)  Std Devs (min)
318     44                   64
667     28                   29
669     22                   25
993     25                   25
1026    25                   17
PdfSam  54                   18
Raptor  59                   13

Table 2 shows the average effort (in minutes) for each task. We found in Study 1 that, on average, participants spent 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, it took participants on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session(a). So, this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

(a) In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need to conclude one debugging task in one-to-many interactive debugging sessions.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session we normalized the data using Equation 1 and associated theratios with their respective task elapsed times Figure 5 combines the data fromthe debugging sessions each point in the plot represents a debugging sessionwith a specific rate of breakpoints per minute Analysing the first breakpointdata we found a correlation between task elapsed time and time of the firstbreakpoint (ρ = minus047) resulting that task elapsed time is inversely correlatedto the time of taskrsquos first breakpoint

\[ f(x) = \frac{\alpha}{x^{\beta}} \tag{2} \]

where α = 1.2 and β = 0.44.
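As an illustrative reading of this fit (our interpretation, since the axes are shown only in Figure 5): taking x as the task elapsed time in minutes and f(x) as the predicted normalised first-breakpoint time MFB, a 25-minute task yields f(25) = 1.2/25^0.44 ≈ 0.29, close to the 23%-27% averages reported for RQ1.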

Fig 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].
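To make the RQ2 analysis concrete, the following minimal Java sketch (not the authors' analysis code; we assume ρ denotes a Spearman rank correlation, and the data below are made up) normalises each session's first-breakpoint time with Equation 1 and correlates the ratio with the task's elapsed time:

import java.util.Arrays;

// Sketch of the RQ2 procedure: compute MFB = EF/ET per session, then
// correlate it with task elapsed time using Spearman's rank correlation.
public class Rq2AnalysisSketch {

    // Rank values in ascending order (1-based); ties are ignored for brevity.
    static double[] ranks(double[] v) {
        Integer[] idx = new Integer[v.length];
        for (int i = 0; i < v.length; i++) idx[i] = i;
        Arrays.sort(idx, (a, b) -> Double.compare(v[a], v[b]));
        double[] r = new double[v.length];
        for (int pos = 0; pos < v.length; pos++) r[idx[pos]] = pos + 1;
        return r;
    }

    // Spearman's rho: 1 - 6 * sum(d^2) / (n (n^2 - 1)), valid without ties.
    static double spearman(double[] x, double[] y) {
        double[] rx = ranks(x), ry = ranks(y);
        double d2 = 0;
        for (int i = 0; i < x.length; i++) d2 += Math.pow(rx[i] - ry[i], 2);
        int n = x.length;
        return 1 - 6 * d2 / (n * (n * n - 1.0));
    }

    public static void main(String[] args) {
        double[] ef = {6, 3, 10};   // minutes to first breakpoint (made up)
        double[] et = {24, 30, 20}; // task elapsed times in minutes (made up)
        double[] mfb = new double[ef.length];
        for (int i = 0; i < ef.length; i++) mfb[i] = ef[i] / et[i]; // Equation 1
        System.out.println("rho = " + spearman(mfb, et));
    }
}

Run on the study's real session data, a procedure of this kind is what would produce a coefficient such as the reported ρ = −0.47.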


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where Study 1 showed 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.
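The paper does not describe the classification procedure; as a purely illustrative sketch of the categories used in Tables 3 and 4, a naive line-based classifier could look like this (all heuristics here are our assumptions):

import java.util.regex.Pattern;

// Naive illustration of classifying the statement under a breakpoint into
// the categories of Tables 3 and 4; not the authors' actual procedure.
public class StatementClassifierSketch {

    static String classify(String line) {
        String s = line.trim();
        if (s.startsWith("return")) return "return";
        if (s.startsWith("if") || s.startsWith("} else if")) return "if-statement";
        if (s.startsWith("while")) return "while-loop";
        // Assignment: an '=' that is not part of ==, !=, <=, or >=.
        if (Pattern.compile("[^=!<>]=[^=]").matcher(s).find()) return "assignment";
        // Call: an identifier followed by an argument list, ending the statement.
        if (s.matches(".*\\w+\\s*\\(.*\\).*;")) return "call";
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(classify(
            "JabRefDesktop.openExternalViewer(metaData(), link.toString(), field);")); // call
        System.out.println(classify("int n = parser.count();")); // assignment
    }
}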

Table 3 Study 1 - Breakpoints per type of statement

Statements     Number of Breakpoints   %
call           111                     53
if-statement   39                      19
assignment     36                      17
return         18                      10
while-loop     3                       1

Table 4 Study 2 - Breakpoints per type of statement

Statements     Number of Breakpoints   %
call           43                      43
if-statement   22                      22
assignment     27                      27
return         4                       4
while-loop     4                       4


Our results show that in both studies about 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, the while-loop statement being the least common (1%-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field);

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale in that decision, because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks   Classes              Lines of Code   Breakpoints
0318    AuthorsFormatter     43              5
0318    AuthorsFormatter     131             3
0667    BasePanel            935             2
0667    BasePanel            969             3
0667    JabRefDesktop        430             2
0669    OpenDatabaseAction   268             2
0669    OpenDatabaseAction   433             4
0669    OpenDatabaseAction   451             4
0993    EntryEditor          717             2
0993    EntryEditor          720             2
0993    EntryEditor          723             2
0993    BibDatabase          187             2
0993    BibDatabase          456             2
1026    EntryEditor          1184            2
1026    BibtexParser         160             2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                 Lines of Code   Breakpoints
PdfReader               230             2
PdfReader               806             2
PdfReader               1923            2
ConsoleServicesFacade   89              2
ConsoleClient           81              2
PdfUtility              94              2
PdfUtility              96              2
PdfUtility              102             2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes             Lines of Code   Breakpoints
icsUtils            333             3
Game                1751            2
ExamineController   41              2
ExamineController   84              3
ExamineController   87              2
ExamineController   92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line of code 969 in class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed if the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1, because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                     Lines of Code        Breakpoints
BibtexParser                138, 151, 159        2, 2, 2
                            160, 165, 168        3, 2, 3
                            176, 198, 199, 299   2, 2, 2, 2
EntryEditor                 717, 720, 721        3, 4, 2
                            723, 837, 842        2, 3, 2
                            1184, 1393           3, 2
BibDatabase                 175, 187, 223, 456   2, 3, 2, 6
OpenDatabaseAction          433, 450, 451        4, 2, 4
JabRefDesktop               40, 84, 430          2, 2, 3
SaveDatabaseAction          177, 188             4, 2
BasePanel                   935, 969             2, 5
AuthorsFormatter            43, 131              5, 4
EntryTableTransferHandler   346                  2
FieldTextMenu               84                   2
JabRefFrame                 1119                 2
JabRefMain                  8                    5
URLUtil                     95                   2

Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints are in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6.


Table 9 Study 1 - Breakpoints by class across different tasks (Issues 318, 667, 669, 993, 1026)

Types                Tasks with Breakpoints   Breakpoints   Dev Diversities
SaveDatabaseAction   3 of 5                   7             2
BasePanel            4 of 5                   14            7
JabRefDesktop        2 of 5                   9             4
EntryEditor          3 of 5                   36            4
BibtexParser         3 of 5                   44            6
OpenDatabaseAction   3 of 5                   19            13
JabRef               3 of 5                   3             3
JabRefMain           4 of 5                   5             4
URLUtil              2 of 5                   4             2
BibDatabase          3 of 5                   19            4

In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate if sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser, and (2) a controlled experiment on fault-location tasks (with a Tetris program as warm-up), using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is a reference management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelance developers21, 23 male and seven female. Our participants have, on average, six years of experience in software development (std. dev. four years). They have on average 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 13 participated in fault location (controlled experiment: 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked five faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subject's profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV on a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 https://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25, with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, like whether they completed the tasks successfully or not, and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig 9 GV for Task 0667


Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without previously seeing the code.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants of the experimental group used GV to support their tasks (video recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants if GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) through our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) on the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig 11 GV usefulness - experimental phase one

Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


Table 10 Results from control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   Δ [C-E] (s)   % [E/C]
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026

Metric             Control [C]   Experiment [E]   Δ [C-E] (s)   % [E/C]
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed recording sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we observed that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to carefully toggle the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time to decide the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We go back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known to be not scalable, so we are expecting issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution One participant commented that the graph is relevant for one version of the system, but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation, because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation, because developers can always choose appropriate paths.

Understanding One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation, because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult, because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will add it also into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view, whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem, because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to identify further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debuggers' developers, and educators. SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers in deciding where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As with any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation, because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats, because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc., through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat, because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef), because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the systems used can affect our results.

9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history, and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43, 44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode, because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6, 52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], discussed by Hofer et al., exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging Jiang et al. [33] studied the change impact analysis process that should be done during software maintenance by developers to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first


breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging


behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceeding of the 28th international conference on Software engineering - ICSE '06 p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97 pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transaction on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11 p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts, illustrated by the sketch after this list, include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is event data collected when a developer performs some actions during a debugging session.

The SDS provides several services for manipulating, querying, and searching collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

    http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
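For illustration, the response body could resemble the following sketch (the field names are hypothetical, following Spring Data REST's HAL conventions; the actual payload depends on the SDS domain model):

    {
      "_embedded": {
        "developers": [
          { "id": 1, "name": "petrillo" }
        ]
      }
    }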

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
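For example, a researcher could aggregate collected breakpoints per type with a query along the following lines (a sketch only: the table and column names are assumptions, not the actual SDS schema):

    -- Count breakpoints per type, most-hit types first.
    SELECT t.name, COUNT(b.id) AS breakpoints
    FROM breakpoint b
    JOIN type t ON b.type_id = t.id
    GROUP BY t.name
    ORDER BY breakpoints DESC;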

Full-text Search Engine. The SDS also provides an ElasticSearch31 engine, which is a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
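For instance, a query in the spirit of Figure 16 could retrieve the method invocations collected during one session (a sketch: the node labels, relationship type, and property names are assumptions about the SDS graph model):

    // Methods visited during session 42 and who invoked whom.
    MATCH (caller:Method)-[i:INVOKES]->(callee:Method)
    WHERE i.session = 42
    RETURN caller.name, callee.name;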

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
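As a minimal sketch of how such a listener hooks into the Eclipse debug infrastructure (the body comment stands in for the SDT's actual reporting code):

    import org.eclipse.debug.core.DebugEvent;
    import org.eclipse.debug.core.DebugPlugin;
    import org.eclipse.debug.core.IDebugEventSetListener;

    // Listens to debugger events and forwards stepping data to the SDS.
    public class TracerSketch implements IDebugEventSetListener {

        public void start() {
            // Eclipse delivers debug events to registered listeners.
            DebugPlugin.getDefault().addDebugEventListener(this);
        }

        @Override
        public void handleDebugEvents(DebugEvent[] events) {
            for (DebugEvent event : events) {
                // A step (Step Into/Over/Return) ends with a SUSPEND
                // event whose detail is STEP_END.
                if (event.getKind() == DebugEvent.SUSPEND
                        && event.getDetail() == DebugEvent.STEP_END) {
                    // Here the SDT would analyse the stack trace and
                    // send the extracted invocations to the SDS.
                }
            }
        }
    }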

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. They are direct call graphs [31], as shown in Figure 21, to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
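Such a search maps onto an ElasticSearch fuzzy query roughly as follows (a sketch: the field name is an assumption about how the SDS indexes breakpoints):

    { "query": { "fuzzy": { "typeName": { "value": "fcatory" } } } }

A fuzzy query tolerates a bounded edit distance, so the misspelled term still matches breakpoints toggled in factory-related types.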


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertices V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertices that invoke methods and β is the subset of all vertices that are invoked by methods, then the Starting and Ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
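The definition translates directly into code. The following minimal Java sketch (names are illustrative) derives the starting points from the collected invocation edges; ending points are computed symmetrically by swapping the two sets:

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    class EntryPointFinder {
        // Each edge is a pair <invoking, invoked>, as defined above.
        static Set<String> startingPoints(List<String[]> edges) {
            Set<String> alpha = new HashSet<>(); // invoking methods
            Set<String> beta = new HashSet<>();  // invoked methods
            for (String[] e : edges) {
                alpha.add(e[0]);
                beta.add(e[1]);
            }
            Set<String> starting = new HashSet<>(alpha);
            starting.removeAll(beta); // V_SP in alpha and not in beta
            return starting;
        }
    }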

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using Eclipse Debug Core services. However, SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


Table 1 Summary of the issues considered in JabRef in Study 1

Issue  Summary
318    "Normalize to Bibtex name format"
667    "hash/pound sign causes URL link to fail"
669    "JabRef 3.1/3.2 writes bib file in a format that it will not read"
993    "Issues in BibTeX source opens save dialog and opens dialog 'Problem with parsing entry' multiple times"
1026   "Jabref removes comments inside the Bibtex code"

5.1.4 Artifacts and Working Environment

We provided the participants with a tutorial15 explaining how to install and configure the tools required for the study and how to use them through a warm-up task. We also presented a video16 to guide the participants during the warm-up task. In a second document, we described the five faults and the steps to reproduce them. We also provided participants with a video demonstrating step-by-step how to reproduce the five defects, to help them get started.

We provided a pre-configured Eclipse workspace to the participants and asked them to install Java 8 and Eclipse Mars 2 with the Swarm Debugging Tracer plug-in [17] to collect breakpoint-related events automatically. The Eclipse workspace contained two Java projects: a Tetris game for the warm-up task and JabRef v3.2 for the study. We also required that the participants install and configure the Open Broadcaster Software17 (OBS), open-source software for live streaming and recording. We used the OBS to record the participants' screens.

5.1.5 Study Procedure

After installing their environments, we asked participants to perform a warm-up task with a Tetris game. The task consisted of starting a debugging session, setting a breakpoint, and debugging the Tetris program to locate a given method. We used this task to confirm that the participants' environments were properly configured and also to accustom the participants to the study settings. It was a trivial task that we also used to filter the participants who would have too little knowledge of Java, Eclipse, and the Eclipse Java debugger.

15 http://swarmdebugging.org/publication
16 https://youtu.be/U1sBMpfL2jc
17 https://obsproject.com


All participants correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that the task would require about 20 minutes for each fault, which we will discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking if they found the faults, where the faults were, why the faults happened, if they were tired, and a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results.

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
  – call: method/function invocations
  – return: returns of values
  – assignment: settings of values
  – if-statement: conditional statements
  – while-loop: loops, iterations

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?

Based on this data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started to count when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example.

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint.

– End Time (T): the time when the participant finished a task.


– Elapsed End Time (ET): ET = T − ST.
– Elapsed Time of First Breakpoint (EF): EF = FB − ST.

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks were solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 have different approaches. The tasks in Study 1 were fault location tasks: developers did not correct the faults, while the ones in Study 2 were fault correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

    MFB = EF / ET    (1)

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Task    Average Time (min)  Std Dev (min)
318     44                  64
667     28                  29
669     22                  25
993     25                  25
1026    25                  17
PdfSam  54                  18
Raptor  59                  13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std dev 17%). In Study 2, it took participants on average 23% of the task time to set the first breakpoint (std dev 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session(a). So, this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

(a) In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need to conclude one debugging task in one-to-many interactive debugging sessions.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47): task elapsed time is inversely correlated to the time of the task's first breakpoint. The fitted curve is

    f(x) = α / x^β    (2)

where α = 1.2 and β = 0.44.

Fig 5 Relation between time of the first breakpoint and task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements: in Study 1, we found 17%, while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statement     Breakpoints  %
call          111          53%
if-statement  39           19%
assignment    36           17%
return        18           10%
while-loop    3            1%

Table 4 Study 2 - Breakpoints per type of statement

Statement     Breakpoints  %
call          43           43%
if-statement  22           22%
assignment    27           27%
return        4            4%
while-loop    4            4%


Our results show that, in both studies, over 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, the while-loop statement being the least common (2-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and different tasks. We sorted all the breakpoints from our data by the class in which they were set and line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line for the same task by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on

    JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale in that decision, because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Task  Class               Line of Code  Breakpoints
0318  AuthorsFormatter    43            5
0318  AuthorsFormatter    131           3
0667  BasePanel           935           2
0667  BasePanel           969           3
0667  JabRefDesktop       430           2
0669  OpenDatabaseAction  268           2
0669  OpenDatabaseAction  433           4
0669  OpenDatabaseAction  451           4
0993  EntryEditor         717           2
0993  EntryEditor         720           2
0993  EntryEditor         723           2
0993  BibDatabase         187           2
0993  BibDatabase         456           2
1026  EntryEditor         1184          2
1026  BibtexParser        160           2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Class                  Line of Code  Breakpoints
PdfReader              230           2
PdfReader              806           2
PdfReader              1923          2
ConsoleServicesFacade  89            2
ConsoleClient          81            2
PdfUtility             94            2
PdfUtility             96            2
PdfUtility             102           2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Class              Line of Code  Breakpoints
icsUtils           333           3
Game               1751          2
ExamineController  41            2
ExamineController  84            3
ExamineController  87            2
ExamineController  92            2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 in class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed if the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, resulting in 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints in 3 out of 5 tasks by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Class                      Lines of Code       Breakpoints
BibtexParser               138, 151, 159       2, 2, 2
                           160, 165, 168       3, 2, 3
                           176, 198, 199, 299  2, 2, 2, 2
EntryEditor                717, 720, 721       3, 4, 2
                           723, 837, 842       2, 3, 2
                           1184, 1393          3, 2
BibDatabase                175, 187, 223, 456  2, 3, 2, 6
OpenDatabaseAction         433, 450, 451       4, 2, 4
JabRefDesktop              408, 44, 430        2, 2, 3
SaveDatabaseAction         177, 188            4, 2
BasePanel                  935, 969            2, 5
AuthorsFormatter           43, 131             5, 4
EntryTableTransferHandler  346                 2
FieldTextMenu              84                  2
JabRefFrame                1119                2
JabRefMain                 8                   5
URLUtil                    95                  2

Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints are in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We find that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntityEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints by different developers on different tasks.

Table 9 Study 1 - Breakpoints by class across different tasks

Type                Tasks with breakpoints (of 5)  Breakpoints  Dev Diversities
SaveDatabaseAction  3                              7            2
BasePanel           4                              14           7
JabRefDesktop       2                              9            4
EntryEditor         3                              36           4
BibtexParser        3                              44           6
OpenDatabaseAction  3                              19           13
JabRef              3                              3            3
JabRefMain          4                              5            4
URLUtil             2                              4            2
BibDatabase         3                              19           4

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same line of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate if sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks using GV integrated into Eclipse (with a warm-up task on a small Tetris program). The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is a reference management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21: 23 male and seven female. Our participants have, on average, six years of experience in software development (std dev four years). They have on average 4.8 years of Java experience (std dev 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or experts on Java.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 11 participated in fault location (controlled experiment - 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault?" for Task 318, or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subject's profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all the tools needed to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV on a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, like whether they completed the tasks successfully or not, and (just for the experimental group) commenting on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video recording tool.

6.2 Results

We now discuss the results of our evaluation

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found - 27/31 sessions) and "bad" sessions (the fault was not found - 4/31 sessions). We analysed results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig 9 GV for Task 0667


Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code previously.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants of the experimental group used GV to support their tasks (video recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on a selected type. We asked participants if GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) through our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) on the task survey after fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig 11 GV usefulness - experimental phase one

Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric            Control [C]  Experiment [E]  ∆ [C-E] (s)  % [E/C]
First breakpoint  00:02:55     00:03:40        -44          126%
Time to start     00:04:44     00:05:18        -33          112%
Elapsed time      00:30:08     00:16:05        843          53%

Task 1026
Metric            Control [C]  Experiment [E]  ∆ [C-E] (s)  % [E/C]
First breakpoint  00:02:42     00:04:48        -126         177%
Time to start     00:04:02     00:03:43        19           92%
Elapsed time      00:24:58     00:20:41        257          83%

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed recording sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we see that the experimental group spent more time setting the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time finishing both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to carefully toggle the first breakpoint but then completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time to decide the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We come back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known to be not scalable, so we are expecting issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and-or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently in different parts of the graph, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view, whose nodes could be methods and-or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

"It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent."

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to find further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for Software Engineering researchers, developers, debuggers' developers, and educators. SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify the debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and-or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and-or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits, to create new tools - using novel data-mining techniques - to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities in general: they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant on the first and last tasks. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the times spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6, 52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine which parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., exploits the observation that the smaller the failure-inducing input, the less program code is covered: it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should follow during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human debugging effort.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping into/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analysed more than 10 hours of developers' videos in 45 debugging sessions, performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation gave developers elements to support their hypotheses about fault locations without looking at the code beforehand and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson. Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211-220
4. R. Stallman, S. Shebs. Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright. GNU DDD - Data Display Debugger (2010)
6. A. Ko, in Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13-16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, in 2010 ACM/IEEE 32nd International Conference on Software Engineering, vol. 1, p. 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126-135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492-501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 100-116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman. How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen. Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Röhm. Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140-144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.-G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.-G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285-295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng. IEEE Software 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar. Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-Oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96-106. DOI 10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz. Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel. Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6
25. T. Ball, S. Eick. Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn. Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming. IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11-20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter. IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572-583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, in Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108-124 (1997). DOI 10.1145/263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker. Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212
33. S. Jiang, C. McMillan, R. Santelices. Empirical Software Engineering, pp. 1-39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271-278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder. IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung. IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, NY, USA, 2014), pp. 53-63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14-24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 689-699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1-11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251-260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31-40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456-465
45. Z. Soh, F. Khomh, Y.-G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391-400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99-110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant. International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson. Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ACM Press, New York, NY, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, in Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken. Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, NY, USA, 2009), pp. 1569-1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa. Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485-495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, in Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110-119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175-. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063-3072. DOI 10.1145/2466416.2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan. ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialised persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56] and include the following (a minimal code sketch of two of these entities is shown after the list):

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of one or more Eclipse projects.
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.
– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the event data collected when a developer performs some actions during a debugging session.

Fig 14 The Swarm Debugging metadata [17]
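To make the meta-model concrete, the following plain-Java sketch shows how two central concepts, Session and Breakpoint, could be represented; the field names are illustrative assumptions, not the exact SDS schema:

  import java.util.ArrayList;
  import java.util.List;

  // Illustrative sketch of two concepts from Figure 14 (field names assumed).
  class Breakpoint {
      String type;    // fully qualified name of the class or interface
      String method;  // enclosing method, if appropriate
      int line;       // source line where the breakpoint was toggled
  }

  class Session {
      long id;
      String developer;  // the SDT user who created the session
      String task;       // the task being debugged
      List<Breakpoint> breakpoints = new ArrayList<>();  // collected in this session
  }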

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
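As an illustration, the same query could be issued programmatically; the sketch below uses Java 11's java.net.http client and assumes only the endpoint shown above:

  import java.net.URI;
  import java.net.http.HttpClient;
  import java.net.http.HttpRequest;
  import java.net.http.HttpResponse;

  public class SwarmQuery {
      public static void main(String[] args) throws Exception {
          // GET the developers whose name is "petrillo" from the SDS.
          HttpRequest request = HttpRequest.newBuilder()
                  .uri(URI.create(
                      "http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                  .GET()
                  .build();
          HttpResponse<String> response = HttpClient.newHttpClient()
                  .send(request, HttpResponse.BodyHandlers.ofString());
          System.out.println(response.body()); // JSON list of matching developers
      }
  }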

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
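For instance, a researcher could aggregate breakpoints per type directly over the SQL store; the sketch below is a hypothetical JDBC query, and the connection URL, table, and column names are assumptions for illustration:

  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.sql.ResultSet;
  import java.sql.Statement;

  public class BreakpointsPerType {
      public static void main(String[] args) throws Exception {
          // Connection URL and credentials are placeholders.
          try (Connection conn = DriverManager.getConnection(
                   "jdbc:postgresql://db.swarmdebugging.org/swarm", "user", "secret");
               Statement stmt = conn.createStatement();
               ResultSet rs = stmt.executeQuery(
                   "SELECT type_name, COUNT(*) AS breakpoints "
                 + "FROM breakpoint GROUP BY type_name ORDER BY breakpoints DESC")) {
              while (rs.next()) {
                  // Print each type with the number of breakpoints toggled on it.
                  System.out.println(rs.getString("type_name") + ": "
                          + rs.getInt("breakpoints"));
              }
          }
      }
  }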

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database, without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
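Such a query can also be run programmatically through the Neo4J Java driver; in the sketch below, the node label Method, the INVOKES relationship, and the connection settings are assumptions for illustration, not the actual SDS graph schema:

  import org.neo4j.driver.AuthTokens;
  import org.neo4j.driver.Driver;
  import org.neo4j.driver.GraphDatabase;
  import org.neo4j.driver.Result;
  import org.neo4j.driver.Session;

  public class InvocationQuery {
      public static void main(String[] args) {
          try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                   AuthTokens.basic("neo4j", "secret"));
               Session session = driver.session()) {
              // Match pairs of invoking/invoked methods collected during sessions.
              Result result = session.run(
                  "MATCH (a:Method)-[:INVOKES]->(b:Method) "
                + "RETURN a.name AS invoking, b.name AS invoked LIMIT 25");
              result.forEachRemaining(record -> System.out.println(
                  record.get("invoking").asString() + " -> "
                + record.get("invoked").asString()));
          }
      }
  }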

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
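A minimal sketch of how such listeners can be registered with the Eclipse debug framework is shown below; the event handling is only indicative of what the actual SDT does:

  import org.eclipse.core.resources.IMarkerDelta;
  import org.eclipse.debug.core.DebugEvent;
  import org.eclipse.debug.core.DebugPlugin;
  import org.eclipse.debug.core.IBreakpointListener;
  import org.eclipse.debug.core.IDebugEventSetListener;
  import org.eclipse.debug.core.model.IBreakpoint;

  public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

      public void start() {
          // Register for debug events and breakpoint changes in the workbench.
          DebugPlugin.getDefault().addDebugEventListener(this);
          DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
      }

      @Override
      public void handleDebugEvents(DebugEvent[] events) {
          for (DebugEvent event : events) {
              // A stepping action (Step Into/Over/Return) has just suspended the
              // thread: inspect the stack trace and record method invocations.
              if (event.getKind() == DebugEvent.SUSPEND
                      && event.getDetail() == DebugEvent.STEP_END) {
                  // send invoking/invoked pairs to the Swarm Debugging Services
              }
          }
      }

      @Override
      public void breakpointAdded(IBreakpoint breakpoint) {
          // capture the breakpoint's type, method, and line, and POST them to the SDS
      }

      @Override
      public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

      @Override
      public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
  }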

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints or trigger stepping events, such as Step Into, Step Over, or Step Return. These events are caught, and the stack-trace items are analysed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the view "Swarm Manager" and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger: the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing one invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices, which send RESTful messages to a SwarmRestFacade connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace, without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. They are direct call graphs [31], as shown in Figure 21, to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".

Fig 19 Breakpoint search tool (fuzzy search example)
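Such a fuzzy search can be reproduced against the underlying ElasticSearch engine; the index and field names in the sketch below are assumptions for illustration, mirroring the misspelled "fcatory" example:

  import java.net.URI;
  import java.net.http.HttpClient;
  import java.net.http.HttpRequest;
  import java.net.http.HttpResponse;

  public class FuzzyBreakpointSearch {
      public static void main(String[] args) throws Exception {
          // Fuzzy query tolerating the typo "fcatory" (matching, e.g., "factory").
          String query = "{ \"query\": { \"fuzzy\": { \"typeName\": \"fcatory\" } } }";
          HttpRequest request = HttpRequest.newBuilder()
                  .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                  .header("Content-Type", "application/json")
                  .POST(HttpRequest.BodyPublishers.ofString(query))
                  .build();
          HttpResponse<String> response = HttpClient.newHttpClient()
                  .send(request, HttpResponse.BodyHandlers.ofString());
          System.out.println(response.body()); // JSON hits with matching breakpoints
      }
  }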


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked, then the starting and ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
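These sets are straightforward to compute from the collected invocation pairs; the following sketch (with hypothetical method names) derives them with plain set operations:

  import java.util.HashSet;
  import java.util.List;
  import java.util.Set;

  public class EntryExitPoints {
      // Each invocation is a pair <invoking, invoked>, as collected by the SDT.
      record Invocation(String invoking, String invoked) {}

      public static void main(String[] args) {
          List<Invocation> edges = List.of(
                  new Invocation("main", "parse"),
                  new Invocation("parse", "readField"));

          Set<String> alpha = new HashSet<>(); // vertexes that invoke methods
          Set<String> beta = new HashSet<>();  // vertexes that are invoked
          for (Invocation e : edges) {
              alpha.add(e.invoking());
              beta.add(e.invoked());
          }

          Set<String> starting = new HashSet<>(alpha);
          starting.removeAll(beta); // in alpha and not in beta
          Set<String> ending = new HashSet<>(beta);
          ending.removeAll(alpha);  // in beta and not in alpha

          System.out.println("Starting: " + starting); // [main]
          System.out.println("Ending: " + ending);     // [readField]
      }
  }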

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualising breakpoints and events during these sessions. We created real-time, interactive visualisations using web technologies, providing an automatic memory of developer explorations. Moreover, dividing software exploration by sessions makes their call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


All participants correctly executed the warm-up task.

After performing the warm-up task, each participant performed debugging to locate the faults. We established a maximum limit of one hour per task and informed the participants that each task would require about 20 minutes per fault, which we will discuss as a possible threat to validity. We based this limit on previous experiences with these tasks during mock trials. After the participants performed each task, we asked them to answer a post-experiment questionnaire to collect information about the study, asking whether they found the faults, where the faults were, why the faults happened, whether they were tired, and for a general summary of their debugging experience.

5.1.6 Data Collection

The Swarm Debugging Tracer plug-in automatically and transparently collected all debugging data (breakpoints, stepping, method invocations). Also, we recorded the participants' screens during their debugging sessions with OBS. We collected the following data:

– 28 video recordings, one per participant and task, which are essential to control the quality of each session and to produce a reliable and reproducible chain of evidence for our results;

– The statements (lines in the source code) where the participants set breakpoints. We considered the following types of statements because they are representative of the main concepts in any programming language:
  – call: method/function invocations;
  – return: returns of values;
  – assignment: settings of values;
  – if-statement: conditional statements;
  – while-loop: loop iterations;

– Summaries of the results of the study, one per participant, via a questionnaire, which included the following questions:
  – Did you locate the fault?
  – Where was the fault?
  – Why did the fault happen?
  – Were you tired?
  – How was your debugging experience?

Based on these data, we obtained or computed the following metrics per participant and task:

– Start Time (ST): the timestamp when the participant started a task. We analysed each video, and we started to count when the participant effectively started a task, i.e., when she started the Swarm Debugging Tracer plug-in, for example;

– Time of First Breakpoint (FB): the time when the participant set her first breakpoint;

– End Time (T): the time when the participant finished a task;


– Elapsed End Time (ET): ET = T − ST;
– Elapsed Time to First Breakpoint (EF): EF = FB − ST.

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks were solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 followed different approaches. The tasks in Study 1 were fault-location tasks (developers did not correct the faults), while the ones in Study 2 were fault-correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET (1)
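As an illustrative example (with made-up numbers), a session that lasts ET = 30 minutes, in which the first breakpoint is set EF = 7.5 minutes after the start, yields MFB = 7.5/30 = 0.25: the participant spent a quarter of the session before toggling the first breakpoint.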

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Tasks    Average Times (min)   Std Devs (min)
318      44                    64
667      28                    29
669      22                    25
993      25                    25
1026     25                    17
PdfSam   54                    18
Raptor   59                    13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, it took participants on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort for setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session [a]. So, this effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

[a] In fact, there is a "debugging task" that starts when a developer starts to investigate the issue, to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalised the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = −0.47): task elapsed time is inversely correlated with the time of the task's first breakpoint, following Equation 2:

f(x) = α / x^β (2)

where α = 12 and β = 0.44.

Fig 5 Relation between time of the first breakpoint and task elapsed time(data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints were set on call statements, while only 1% (3/207) were set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of the breakpoints were set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where Study 1 showed 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of the breakpoints were on control-flow statements, while 53% were on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statements     Numbers of Breakpoints   %
call           111                      53
if-statement   39                       19
assignment     36                       17
return         18                       10
while-loop     3                        1

Table 4 Study 2 - Breakpoints per type of statement

Statements     Numbers of Breakpoints   %
call           43                       43
if-statement   22                       22
assignment     27                       27
return         4                        4
while-loop     4                        4


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow-related statements were comparatively fewer, the while-loop statement being the least common (2-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same lines of code for different participants performing the same task, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Röhm [15]. We suggest that there is an underlying rationale in that decision, because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks   Classes              Lines of Code   Breakpoints
0318    AuthorsFormatter     43              5
0318    AuthorsFormatter     131             3
0667    BasePanel            935             2
0667    BasePanel            969             3
0667    JabRefDesktop        430             2
0669    OpenDatabaseAction   268             2
0669    OpenDatabaseAction   433             4
0669    OpenDatabaseAction   451             4
0993    EntryEditor          717             2
0993    EntryEditor          720             2
0993    EntryEditor          723             2
0993    BibDatabase          187             2
0993    BibDatabase          456             2
1026    EntryEditor          1184            2
1026    BibtexParser         160             2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                 Lines of Code   Breakpoints
PdfReader               230             2
PdfReader               806             2
PdfReader               1923            2
ConsoleServicesFacade   89              2
ConsoleClient           81              2
PdfUtility              94              2
PdfUtility              96              2
PdfUtility              102             2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes             Lines of Code   Breakpoints
icsUtils            333             3
Game                1751            2
ExamineController   41              2
ExamineController   84              3
ExamineController   87              2
ExamineController   92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints, for different tasks, by different participants. For example, five different participants set five breakpoints on line of code 969 in class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of the breakpoints. For example, class BibtexParser had 21% (44/207) of the breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1, because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                     Lines of Code        Breakpoints
BibtexParser                138, 151, 159        2, 2, 2
BibtexParser                160, 165, 168        3, 2, 3
BibtexParser                176, 198, 199, 299   2, 2, 2, 2
EntryEditor                 717, 720, 721        3, 4, 2
EntryEditor                 723, 837, 842        2, 3, 2
EntryEditor                 1184, 1393           3, 2
BibDatabase                 175, 187, 223, 456   2, 3, 2, 6
OpenDatabaseAction          433, 450, 451        4, 2, 4
JabRefDesktop               40, 84, 430          2, 2, 3
SaveDatabaseAction          177, 188             4, 2
BasePanel                   935, 969             2, 5
AuthorsFormatter            43, 131              5, 4
EntryTableTransferHandler   346                  2
FieldTextMenu               84                   2
JabRefFrame                 1119                 2
JabRefMain                  8                    5
URLUtil                     95                   2

Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.


Table 9 Study 1 - Breakpoints by class across different tasks

Types                Tasks with Breakpoints (of 5)   Breakpoints   Dev Diversities
SaveDatabaseAction   3                               7             2
BasePanel            4                               14            7
JabRefDesktop        2                               9             4
EntryEditor          3                               36            4
BibtexParser         3                               44            6
OpenDatabaseAction   3                               19            13
JabRef               3                               3             3
JabRefMain           4                               5             4
URLUtil              2                               4             2
BibDatabase          3                               19            4


Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviours of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wished to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study Design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks using GV integrated into Eclipse (with a warm-up task in a Tetris program). The planning, realisation, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is reference management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21, 23 male and seven female. Our participants had, on average, six years of experience in software development (std. dev. four years). They had, on average, 4.8 years of Java experience (std. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% were advanced or expert Java developers.

Among these professionals, 23 participated in the qualitative evaluation of GV and 13 participated in fault location (controlled experiment: 7 in the control group and 6 in the experimental group) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked five faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault?" for Task 318 or, for Task 1173, "Where would you toggle a breakpoint to fix the fault?", as well as about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environments were correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV on a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "types" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 https://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25, with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the locations of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found - 27/31 sessions) and "bad" sessions (the fault was not found - 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig 9 GV for Task 0667


Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again, 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code beforehand.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video-recording analysis): navigating, reorganising, and, especially, diving into a type by double-clicking on it. We asked participants whether GV is useful to support software-maintenance tasks. We report that 87% of the participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of the participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig 11 GV usefulness - experimental phase one

Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.


Table 10 Results from control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   % [E/C]
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   % [E/C]
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time setting the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session were nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time finishing both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to toggle the first breakpoint carefully but consequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept, with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We go back to some of the limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem, because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and to perform more experiments to find further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, developers of debuggers, and educators. The SDI (and GV) is open and freely available online28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entities. Thus, breakpoints previously toggled by one developer could help another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – that integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed over time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.
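As an illustration of how such a recommender could exploit the accumulated data, the following sketch ranks candidate locations by how often developers set breakpoints on them in past sessions. It is our own minimal illustration, not part of the SDI; the class and the example locations are hypothetical.

import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch: rank candidate breakpoint locations by how often independent
// developers set breakpoints there in previous debugging sessions.
public class BreakpointRecommenderSketch {

    record PastBreakpoint(String className, int line) { }

    static List<String> recommend(List<PastBreakpoint> history, int k) {
        Map<String, Long> counts = history.stream()
                .collect(Collectors.groupingBy(
                        b -> b.className() + ":" + b.line(),
                        Collectors.counting()));
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()))
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<PastBreakpoint> history = List.of(
                new PastBreakpoint("BasePanel", 969),
                new PastBreakpoint("BasePanel", 969),
                new PastBreakpoint("AuthorsFormatter", 43));
        // Prints [BasePanel:969, AuthorsFormatter:43]
        System.out.println(recommend(history, 2));
    }
}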

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc., through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant on the first and last tasks. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that all participants did not look at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], or version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], described by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to systematically minimise a failure-inducing input. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis, before making changes, by using IDE navigational functionalities. They also did dynamic change impact analysis, after making changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements to support developers' hypotheses about fault locations, without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to developers and can be starting points when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB – The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD – Data Display Debugger (2010)
6. A. Ko, in Proceedings of the 28th International Conference on Software Engineering – ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 – Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering – FOSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-Oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, in Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications – OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension – ICPC 2014 (ACM Press, New York, NY, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering – FSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ACM Press, New York, NY, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, in Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 – Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, NY, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings – International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings – IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We choose and define domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer: an SDT user, who creates and executes debugging sessions.
– Product: the target software product. A product is a set of Eclipse projects (1 or more).
– Task: the task to be executed by developers.
– Session: represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig 14 The Swarm Debugging metadata [17]

– Type: represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types that have source code available as belonging to the project domain.
– Method: a method associated with a type, which can be invoked during debugging sessions.
– Namespace: a container for types. In Java, namespaces are declared with the keyword package.
– Invocation: a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint: represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
– Event: the event data collected when a developer performs some action during a debugging session.

The SDS provides several services for manipulating, querying, and searching the collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
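For illustration, a minimal Java client for this endpoint could be written as follows. This is a sketch: it assumes only that the endpoint above is reachable and returns JSON; the class name is ours.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch: query the SDS RESTful API for developers named "petrillo".
public class SwarmApiClientExample {
    public static void main(String[] args) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(
                    "http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode()); // 200 on success
        System.out.println(response.body());       // JSON list of developers
    }
}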

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides an ElasticSearch31 engine, which is a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.
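To illustrate the kind of query the engine supports, the sketch below posts a standard ElasticSearch fuzzy query over HTTP; the index name ("breakpoints"), the field name ("typeName"), and the local host and port are our assumptions, not the actual SDS schema.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch: a fuzzy full-text query, of the kind used by the breakpoint search
// tool to match a misspelled term such as "fcatory" against "factory".
public class FuzzyBreakpointSearchExample {
    public static void main(String[] args) throws Exception {
        String query = "{ \"query\": { \"fuzzy\": "
                + "{ \"typeName\": { \"value\": \"fcatory\" } } } }";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON hits ranked by similarity
    }
}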

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
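Such queries can also be issued programmatically with the Neo4J Java driver, as in the sketch below; the connection settings, the node label (Method), and the relationship type (INVOKES) are our assumptions for illustration, not the actual SDS graph schema.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

// Sketch: query the SDS graph database for the most-invoked methods.
public class CypherQueryExample {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "password"));
             Session session = driver.session()) {
            Result result = session.run(
                "MATCH (caller:Method)-[:INVOKES]->(callee:Method) "
                + "RETURN callee.name AS name, count(*) AS invocations "
                + "ORDER BY invocations DESC LIMIT 10");
            while (result.hasNext()) {
                Record record = result.next();
                System.out.println(record.get("name").asString()
                        + " : " + record.get("invocations").asLong());
            }
        }
    }
}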

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.
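A skeleton of such a tracer, using the standard Eclipse debug APIs, could look as follows. Only the Eclipse interfaces and registration calls are the real API; the class name and the forwarding comments are ours.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

// Sketch of a tracer that observes stepping events and breakpoint changes.
public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    public void install() {
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);
        plugin.getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Stepping events (Step Into/Over/Return) arrive as SUSPEND
            // events with a STEP_END detail code.
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.STEP_END) {
                // Here the SDT would inspect the stack trace and send an
                // invocation record to the Swarm Debugging Services.
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // Here the SDT would capture the breakpoint's type and location.
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}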

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing one invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invoked methods by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked by methods, then the Starting and Ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
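To make the definition concrete, the following sketch computes the starting and ending methods from a list of invocation edges; the edge representation is our own minimal illustration, not the SDS data model.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: starting methods invoke others but are never invoked;
// ending methods are invoked but never invoke others.
public class StartEndMethods {

    record Edge(String invoking, String invoked) { }

    public static void main(String[] args) {
        List<Edge> edges = List.of(
                new Edge("main", "parse"),
                new Edge("parse", "readField"),
                new Edge("main", "render"));

        Set<String> alpha = new HashSet<>(); // vertexes that invoke methods
        Set<String> beta = new HashSet<>();  // vertexes that are invoked
        for (Edge e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }

        Set<String> starting = new HashSet<>(alpha);
        starting.removeAll(beta); // in alpha and not in beta
        Set<String> ending = new HashSet<>(beta);
        ending.removeAll(alpha);  // in beta and not in alpha

        System.out.println("Starting: " + starting); // [main]
        System.out.println("Ending: " + ending);     // readField and render
    }
}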

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



– Elapsed End time (ET): ET = T - ST
– Elapsed Time First Breakpoint (EF): EF = FB - ST

We manually verified whether participants were successful or not at completing their tasks by analysing the answers provided in the questionnaire and the videos. We knew the locations of the faults because all tasks were solved by JabRef's developers, who completed the corresponding reports in the issue-tracking system with the changes that they made.

5.2 Study 2: Empirical Study on PDFSaM and Raptor

The second study consisted of the re-analysis of 20 videos of debugging sessions available from an empirical study on change-impact analysis with professional developers [33]. The authors conducted their work in two phases. In the first phase, they asked nine developers to read two fault reports from two open-source systems and to fix these faults. The objective was to observe the developers' behaviour as they fixed the faults. In the second phase, they analysed the developers' behaviour to determine whether the developers used any tools for change-impact analysis and, if not, whether they performed change-impact analysis manually.

The two systems analysed in their study are PDF Split and Merge18 (PDFSaM) and Raptor19. They chose one fault report per system for their study. They chose these systems due to their non-trivial size and because the purposes and domains of these systems were clear and easy to understand [33]. The choice of the fault reports followed the criteria that they were already solved and that they could be understood by developers who did not know the systems. Alongside each fault report, they presented the developers with information about the systems, their purpose, their main entry points, and instructions for replicating the faults.

5.3 Results

As can be noticed, Studies 1 and 2 had different approaches. The tasks in Study 1 were fault location tasks (developers did not correct the faults), while the ones in Study 2 were fault correction tasks. Moreover, Study 1 explored five different faults, while Study 2 only analysed one fault per system. The collected data provide a diversity of cases and allow a rich, in-depth view of how developers set breakpoints during different debugging sessions.

In the following, we present the results regarding each research question addressed in the two studies.

18 http://www.pdfsam.org
19 https://code.google.com/p/raptor-chess-interface


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint (EF) by dividing it by the total duration of the task (ET) to compare the performance of participants across tasks (see Equation 1):

MFB = EF / ET    (1)

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Task     Average Time (min)   Std Dev (min)
318      44                   64
667      28                   29
669      22                   25
993      25                   25
1026     25                   17
PdfSam   54                   18
Raptor   59                   13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spent 27% of the total task duration to set the first breakpoint (std. dev. 17%). In Study 2, participants took on average 23% of the task time to set the first breakpoint (std. dev. 17%).

We conclude that the effort of setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session.a This effort is important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

a In fact, there is a "debugging task" that starts when a developer starts to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first breakpoint data, we found a correlation between task elapsed time and time of the first breakpoint (ρ = -0.47): task elapsed time is inversely correlated with the time of the task's first breakpoint.

f(x) = α / x^β    (2)

where α = 12 and β = 0.44.

Fig 5 Relation between the time of the first breakpoint and the task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints were set on call statements, while only 1% (3/207) were set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints were set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where in Study 1 we found 17% while Study 2 showed 27%. After grouping if-statements, returns, and while-loops into control-flow statements, we found that 30% of breakpoints were on control-flow statements, while 53% were on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statement      Breakpoints   %
call           111           53
if-statement   39            19
assignment     36            17
return         18            10
while-loop     3             1

Table 4 Study 2 - Breakpoints per type of statement

Statement      Breakpoints   %
call           43            43
if-statement   22            22
assignment     27            27
return         4             4
while-loop     4             4


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements received comparatively fewer, the while-loop statement being the least common (1-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and different tasks. We sorted all the breakpoints from our data by the class in which they were set and line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints set on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 of class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale in that decision, because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Task   Class                Line of Code   Breakpoints
0318   AuthorsFormatter     43             5
0318   AuthorsFormatter     131            3
0667   BasePanel            935            2
0667   BasePanel            969            3
0667   JabRefDesktop        430            2
0669   OpenDatabaseAction   268            2
0669   OpenDatabaseAction   433            4
0669   OpenDatabaseAction   451            4
0993   EntryEditor          717            2
0993   EntryEditor          720            2
0993   EntryEditor          723            2
0993   BibDatabase          187            2
0993   BibDatabase          456            2
1026   EntryEditor          1184           2
1026   BibtexParser         160            2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Class                   Line of Code   Breakpoints
PdfReader               230            2
PdfReader               806            2
PdfReader               1923           2
ConsoleServicesFacade   89             2
ConsoleClient           81             2
PdfUtility              94             2
PdfUtility              96             2
PdfUtility              102            2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Class               Line of Code   Breakpoints
icsUtils            333            3
Game                1751           2
ExamineController   41             2
ExamineController   84             3
ExamineController   87             2
ExamineController   92             2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 in class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1, because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Class                       Lines of Code         Breakpoints per Line
BibtexParser                138, 151, 159         2, 2, 2
BibtexParser                160, 165, 168         3, 2, 3
BibtexParser                176, 198, 199, 299    2, 2, 2, 2
EntryEditor                 717, 720, 721         3, 4, 2
EntryEditor                 723, 837, 842         2, 3, 2
EntryEditor                 1184, 1393            3, 2
BibDatabase                 175, 187, 223, 456    2, 3, 2, 6
OpenDatabaseAction          433, 450, 451         4, 2, 4
JabRefDesktop               40, 84, 430           2, 2, 3
SaveDatabaseAction          177, 188              4, 2
BasePanel                   935, 969              2, 5
AuthorsFormatter            43, 131               5, 4
EntryTableTransferHandler   346                   2
FieldTextMenu               84                    2
JabRefFrame                 1119                  2
JabRefMain                  8                     5
URLUtil                     95                    2

Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints


Table 9 Study 1 - Breakpoints by class across different tasks

Types               (Issues 318 / 667 / 669 / 993 / 1026)  Breakpoints  Dev Diversities
SaveDatabaseAction  Yes Yes Yes                              7           2
BasePanel           Yes Yes Yes Yes                         14           7
JabRefDesktop       Yes Yes                                  9           4
EntryEditor         Yes Yes Yes                             36           4
BibtexParser        Yes Yes Yes                             44           6
OpenDatabaseAction  Yes Yes Yes                             19          13
JabRef              Yes Yes Yes                              3           3
JabRefMain          Yes Yes Yes Yes                          5           4
URLUtil             Yes Yes                                  4           2
BibDatabase         Yes Yes Yes                             19           4

and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks, using GV integrated into Eclipse and a small Tetris program for warm-up. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is a reference-management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelance developers21, 23 male and seven female. Our participants have, on average, six years of experience in software development (st. dev. four years). They have, on average, 4.8 years of Java experience (st. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or experts on Java.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 13 participated in fault location (controlled experiment: 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV on a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups:

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc


a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted of starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, like whether they completed the tasks successfully or not, and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig 9 GV for Task 0667


Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths in a graph visualisation from several debugging sessions helps developers produce correct hypotheses about fault locations without seeing the code previously.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants of the experimental group used GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants if GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) through our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) on the task survey after fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig 11 GV usefulness - experimental phase one

Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric            Control [C]  Experiment [E]  ∆ [C-E] (s)  [E/C] (%)
First breakpoint  0:02:55      0:03:40         -44          126
Time to start     0:04:44      0:05:18         -33          112
Elapsed time      0:30:08      0:16:05         843          53

Task 1026
Metric            Control [C]  Experiment [E]  ∆ [C-E] (s)  [E/C] (%)
First breakpoint  0:02:42      0:04:48         -126         177
Time to start     0:04:02      0:03:43         19           92
Elapsed time      0:24:58      0:20:41         257          83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed recording sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to toggle carefully the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.
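To illustrate how the two last columns of Table 10 are computed, take the elapsed time of Task 0993, converting both times to seconds:

∆ [C-E] = 1808 s (0:30:08) − 965 s (0:16:05) = 843 s
[E/C] = 965 s / 1808 s ≈ 53%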

Our results show that participants who used the shared debugging data invested more time to decide on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We go back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will add it also into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point out further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, developers of debuggers, and educators. SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – that integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers in deciding where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study that we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern their internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant for the first and last tasks. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], as presented by Hofer et al., exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first


breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements to support developers' hypotheses about fault locations, without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging


behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings, International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings, Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by a SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We choose and define domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include (a sketch of one entity in code follows the list):

– Developer is a SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the event data collected when a developer performs some actions during a debugging session.
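To make the meta-model concrete, the sketch below renders the Breakpoint concept as a plain Java class. The field names are our illustrative assumptions based on the description above; they do not reproduce the actual SDS schema.

// Illustrative sketch of the Breakpoint concept; field names are assumptions.
public class Breakpoint {
    private final long id;
    private final long sessionId;    // Swarm session in which it was collected
    private final String typeName;   // Type on which the breakpoint was toggled
    private final String methodName; // Enclosing method, if appropriate
    private final int lineNumber;    // Location inside the type's source file

    public Breakpoint(long id, long sessionId, String typeName,
                      String methodName, int lineNumber) {
        this.id = id;
        this.sessionId = sessionId;
        this.typeName = typeName;
        this.methodName = methodName;
        this.lineNumber = lineNumber;
    }

    // Convenient location label, e.g., "BibtexParser:160".
    public String location() {
        return typeName + ":" + lineNumber;
    }
}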

The SDS provides several services for manipulating, querying, and searching collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework29. Create, retrieve, update,

29 http://projects.spring.io/spring-boot


and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
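For illustration, such a request can also be issued programmatically. The following minimal Java client (a sketch using only the standard library; the endpoint is the one shown above, and error handling is omitted) prints the raw JSON response:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class SwarmClientExample {
    public static void main(String[] args) throws Exception {
        // Endpoint of the Swarm RESTful API, as shown above.
        URL url = new URL(
            "http://swarmdebugging.org/developers/search/findByName?name=petrillo");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Accept", "application/json");

        // Read and print the JSON response listing the matching developers.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}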

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
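As an example of the kind of relational aggregation the console supports, the sketch below counts breakpoints per type over a hypothetical breakpoint table accessed through JDBC. The connection string, credentials, and table/column names are assumptions for illustration, not the actual SDS schema, and a PostgreSQL JDBC driver is assumed to be on the classpath.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointsPerTypeQuery {
    public static void main(String[] args) throws Exception {
        // Connection details and schema names below are illustrative only.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:postgresql://db.swarmdebugging.org/swarm", "user", "secret");
             Statement stmt = conn.createStatement();
             // Hypothetical schema: one "breakpoint" row per toggled breakpoint,
             // carrying the name of the type it was set on.
             ResultSet rs = stmt.executeQuery(
                 "SELECT type_name, COUNT(*) AS breakpoints "
                 + "FROM breakpoint GROUP BY type_name "
                 + "ORDER BY breakpoints DESC")) {
            while (rs.next()) {
                System.out.println(rs.getString("type_name")
                        + ": " + rs.getInt("breakpoints"));
            }
        }
    }
}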

Full-text Search Engine. The SDS also provides an ElasticSearch31, which is a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. The ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
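For illustration, a Cypher query can also be issued from Java with the official Neo4j driver. In the sketch below, the connection settings and the Method/INVOKES labels are assumptions, since the actual SDS graph schema is not reproduced here; only the driver API itself is standard.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class CypherQueryExample {
    public static void main(String[] args) {
        // Credentials and the MATCH pattern below are illustrative assumptions.
        try (Driver driver = GraphDatabase.driver(
                 "bolt://localhost:7687", AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            // Hypothetical pattern: methods connected by INVOKES relationships.
            Result result = session.run(
                "MATCH (a:Method)-[:INVOKES]->(b:Method) "
                + "RETURN a.name, b.name LIMIT 10");
            while (result.hasNext()) {
                System.out.println(result.next().asMap());
            }
        }
    }
}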

Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.
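The following sketch illustrates how such a tracer can hook the two listener interfaces named above through the standard Eclipse debug API. It is a simplified outline rather than the actual SDT code, and sendToSwarmServices is a hypothetical helper standing in for the RESTful calls to the SDS.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

// Minimal tracer sketch; runs inside an Eclipse plug-in.
public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    public void register() {
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);
        plugin.getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Stepping actions (Step Into/Over/Return) end with a SUSPEND
            // event whose detail is STEP_END.
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.STEP_END) {
                sendToSwarmServices("step", event.getSource().toString());
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        sendToSwarmServices("breakpoint-added", breakpoint.getMarker().toString());
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) {
        sendToSwarmServices("breakpoint-removed", breakpoint.getMarker().toString());
    }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) {
        // Changes (e.g., enabling/disabling) are ignored in this sketch.
    }

    private void sendToSwarmServices(String kind, String payload) {
        // In the real SDT this would be a RESTful POST to the SDS (see above);
        // here we only log, to keep the sketch self-contained.
        System.out.println(kind + ": " + payload);
    }
}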

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing


Fig 19 Breakpoint search tool (fuzzy search example)

an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace, without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool supports fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
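As an illustration, here is the kind of fuzzy query the tool could issue against ElasticSearch for this search; the index and field names (breakpoints, typeName) are assumptions made for the sketch, not the actual SDS schema:

POST /breakpoints/_search
{
  "query": {
    "fuzzy": {
      "typeName": { "value": "fcatory", "fuzziness": "AUTO" }
    }
  }
}

A fuzzy clause matches terms within a bounded edit distance, so the misspelled "fcatory" still retrieves breakpoints toggled on factory-related types; a wildcard clause (e.g., with value "*actory*") would serve the wildcard searches mentioned above.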

Swarm Debugging the Collective Intelligence on Interactive Debugging 49

Fig. 21 Method call graph for the Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertices V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is a pair ⟨Vi, Vj⟩, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertices invoking methods and β is the subset of all vertices invoked by methods, then the starting and ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
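A minimal sketch of how these sets could be computed from a session's invocation edges follows; the method names and the edge representation are illustrative assumptions:

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartEndPoints {
    // Each edge is a pair (invoking method, invoked method).
    record Edge(String invoking, String invoked) {}

    public static void main(String[] args) {
        List<Edge> edges = List.of(
                new Edge("Main.main", "BasePanel.runCommand"),
                new Edge("BasePanel.runCommand", "JabRefDesktop.openExternalViewer"));

        Set<String> alpha = new HashSet<>(); // invoking methods
        Set<String> beta = new HashSet<>();  // invoked methods
        for (Edge e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }

        Set<String> starting = new HashSet<>(alpha);
        starting.removeAll(beta); // in alpha and not in beta
        Set<String> ending = new HashSet<>(beta);
        ending.removeAll(alpha);  // in beta and not in alpha

        System.out.println("Starting points: " + starting); // [Main.main]
        System.out.println("Ending points: " + ending);     // [JabRefDesktop.openExternalViewer]
    }
}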

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


RQ1: Is there a correlation between the time of the first breakpoint and a debugging task's elapsed time?

We normalised the elapsed time between the start of a debugging session and the setting of the first breakpoint, EF, by dividing it by the total duration of the task, ET, to compare the performance of participants across tasks (see Equation 1).

MFB = EF / ET    (1)
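For instance, assuming a session in which the first breakpoint is set 5 minutes into a 20-minute task, Equation 1 gives MFB = 5/20 = 0.25, i.e., a quarter of the task elapsed before the first breakpoint was set.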

Table 2 Elapsed time by task (average) - Study 1 (JabRef) and Study 2

Tasks    Average Times (min)   Std Devs (min)
318      44                    64
667      28                    29
669      22                    25
993      25                    25
1026     25                    17
PdfSam   54                    18
Raptor   59                    13

Table 2 shows the average effort (in minutes) for each task. We find in Study 1 that, on average, participants spend 27% of the total task duration to set the first breakpoint (std dev 17%). In Study 2, participants took, on average, 23% of the task time to set the first breakpoint (std dev 17%).

We conclude that the effort of setting the first breakpoint takes nearly one-quarter of the total effort of a single debugging session.a This effort is therefore important, and this result suggests that debugging time could be reduced by providing tool support for setting breakpoints.

a In fact, there is a "debugging task" that starts when a developer begins to investigate the issue to understand and solve it. There is also an "interactive debugging session" that starts when a developer sets their first breakpoint and decides to run an application in "debugging mode". Also, a developer could need one-to-many interactive debugging sessions to conclude one debugging task.


RQ2: What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and the time of the first breakpoint (ρ = −0.47): task elapsed time is inversely correlated with the time of the task's first breakpoint. The relation is fitted by Equation 2:

f(x) = α / x^β    (2)

where α = 12 and β = 0.44.

Fig. 5 Relation between the time of the first breakpoint and the task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints were set on call statements, while only 1% (3/207) were set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of the breakpoints were set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where Study 1 showed 17% while Study 2 showed 27%. After grouping if-statements, returns, and while-loops into control-flow statements, we found that 30% of the breakpoints were on control-flow statements, while 53% were on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statements     Number of Breakpoints   %
call           111                     53
if-statement   39                      19
assignment     36                      17
return         18                      10
while-loop     3                       1

Table 4 Study 2 - Breakpoints per type of statement

Statements     Number of Breakpoints   %
call           43                      43
if-statement   22                      22
assignment     27                      27
return         4                       4
while-loop     4                       4


Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow related statements were comparatively fewer, the while-loop statement being the least common (1-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same task, i.e., resolving the same fault, by comparing the breakpoints within the same task and across different tasks. We sorted all the breakpoints in our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PdfSam and six in Raptor. For example, in Study 1, on line 969 of class BasePanel, participants set a breakpoint on:

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale in that decision, because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Tasks   Classes              Lines of Code   Breakpoints
0318    AuthorsFormatter     43              5
0318    AuthorsFormatter     131             3
0667    BasePanel            935             2
0667    BasePanel            969             3
0667    JabRefDesktop        430             2
0669    OpenDatabaseAction   268             2
0669    OpenDatabaseAction   433             4
0669    OpenDatabaseAction   451             4
0993    EntryEditor          717             2
0993    EntryEditor          720             2
0993    EntryEditor          723             2
0993    BibDatabase          187             2
0993    BibDatabase          456             2
1026    EntryEditor          1184            2
1026    BibtexParser         160             2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                 Lines of Code   Breakpoints
PdfReader               230             2
PdfReader               806             2
PdfReader               1923            2
ConsoleServicesFacade   89              2
ConsoleClient           81              2
PdfUtility              94              2
PdfUtility              96              2
PdfUtility              102             2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes             Lines of Code   Breakpoints
icsUtils            333             3
Game                1751            2
ExamineController   41              2
ExamineController   84              3
ExamineController   87              2
ExamineController   92              2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 of class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all the breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the number of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of the breakpoints. For example, class BibtexParser had 21% (44/207) of the breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                     Lines of Code         Breakpoints
BibtexParser                138, 151, 159         2, 2, 2
                            160, 165, 168         3, 2, 3
                            176, 198, 199, 299    2, 2, 2, 2
EntryEditor                 717, 720, 721         3, 4, 2
                            723, 837, 842         2, 3, 2
                            1184, 1393            3, 2
BibDatabase                 175, 187, 223, 456    2, 3, 2, 6
OpenDatabaseAction          433, 450, 451         4, 2, 4
JabRefDesktop               408, 44, 430          2, 2, 3
SaveDatabaseAction          177, 188              4, 2
BasePanel                   935, 969              2, 5
AuthorsFormatter            43, 131               5, 4
EntryTableTransferHandler   346                   2
FieldTextMenu               84                    2
JabRefFrame                 1119                  2
JabRefMain                  8                     5
URLUtil                     95                    2

Fig. 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and that 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24


Table 9 Study 1 - Breakpoints by class across different tasks

Types                Tasks with Breakpoints   Breakpoints   Dev Diversities
SaveDatabaseAction   3 of 5                   7             2
BasePanel            4 of 5                   14            7
JabRefDesktop        2 of 5                   9             4
EntryEditor          3 of 5                   36            4
BibtexParser         3 of 5                   44            6
OpenDatabaseAction   3 of 5                   19            13
JabRef               3 of 5                   3             3
JabRefMain           4 of 5                   5             4
URLUtil              2 of 5                   4             2
BibDatabase          3 of 5                   19            4

breakpoints, and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviours of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser, and (2) a controlled experiment on fault-location tasks using GV integrated into Eclipse (with a warm-up task on a small Tetris program). The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is reference management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig. 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelance developers21: 23 male and seven female. Our participants have, on average, six years of experience in software development (std dev: four years). They have on average 4.8 years of Java experience (std dev: 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert in Java.

Among these professionals, 23 participated in the qualitative evaluation (qualitative evaluation of GV) and 13 participated in fault location (controlled experiment: 7 control and 6 experimental) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subject-profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all the tools to perform the warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environments were correctly configured and that the participants understood the instructions. The warm-up task was described in a video, available on-line24, to guide the participants. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV in a regular Web browser, without access to the JabRef source code. We asked the participants to identify the "type" (class) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group performed the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25, with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the locations of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired and to inform us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not, and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 as the video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph contained invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found: 27/31 sessions) and "bad" sessions (the fault was not found: 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of the participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig. 8 GV for Task 0318

For Task 667 (Figure 9), 95% of the participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested URLUtil as the problematic type.

Fig. 9 GV for Task 0667


Fig. 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of the participants (22/23) could suggest a "candidate" type for the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code beforehand.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during the fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video-recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants whether GV is useful for supporting software-maintenance tasks. We report that 87% of the participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of the participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig. 11 GV usefulness - experimental phase one

Fig. 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


Table 10 Results from control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time setting the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) compared to the control group. However, participants who used our approach spent less time finishing both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results from RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation; we now discuss both its intrinsic and accidental advantages and limitations as reported by them. We return to some of these limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where a field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information in the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all the developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to the other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting a system's main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that should be refactored if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information in our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently in different parts of the graph to avoid large nodes. We plan to modify the graph visualisation to offer a "method-level" view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


debugging sessions. It encourages us to pursue our research work in this direction and to perform more experiments to identify further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, debugger developers, and educators. The SDI (and GV) is open and freely available on-line28, and researchers can use it to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns and to identify the debugging strategies that are most efficient in the context of their projects, improving their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entities. Thus, breakpoints previously toggled by one developer could assist another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new session by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent toggling new breakpoints.

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – that integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and there is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed over time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered the question of what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory of how developers set breakpoints and step through code, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights into developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies on these research questions and others, possibly drawn from the ones that developers asked in another study, by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens, and we compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data are. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data are sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would then be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all the faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the chosen system can affect our results.

9 Related work

We now summarise work related to debugging to better position our study among published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program-comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature- and fault-location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history, and defect-report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruptions [42], editing patterns [43,44], program exploration patterns [45], and copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information; it is not designed to explore the dynamic behaviour of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view according to a previous context. Our approach supports both editing mode (finding breakpoints or visualizing paths) and interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive, methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that the empirical evidence for the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine which parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. In delta debugging, as discussed by Hofer et al. [53], the smaller the failure-inducing input, the less program code is covered; it can be used to systematically minimise a failure-inducing input. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change-impact analysis process that developers should perform during software maintenance to make sure that changes do not introduce new faults. They conducted two studies about change-impact analysis during debugging sessions. They found that the programmers in their studies did static change-impact analysis before making changes, using IDE navigational functionalities. They also did dynamic change-impact analysis after making changes, by running the programs. In their studies, programmers did not use any change-impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos over 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first


breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale: different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code with a higher incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation gives developers elements to support hypotheses about fault locations without looking at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on developers' use of breakpoints. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging


behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is a first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, in Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings - 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616


15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-Oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.l. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)


38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings - International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings - Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ESEC/FSE) (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig. 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of one or more Eclipse projects.
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developers, projects, and debugging events.

Fig. 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the data collected when a developer performs some action during a debugging session.
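To make this meta-model concrete, the sketch below renders one of these concepts as a plain Java class. It is only an illustration: the field names are assumptions for illustration, not the actual SDS schema.

// A minimal sketch of the Breakpoint concept (field names are assumptions).
public class Breakpoint {
    private long id;
    private String typeName;    // the Type on which the breakpoint was toggled
    private String methodName;  // the Method, if appropriate
    private int lineNumber;     // location inside the type's source code
    private long sessionId;     // the Session during which it was collected

    public Breakpoint(long id, String typeName, String methodName,
                      int lineNumber, long sessionId) {
        this.id = id;
        this.typeName = typeName;
        this.methodName = methodName;
        this.lineNumber = lineNumber;
        this.sessionId = sessionId;
    }
}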

The SDS provides several services for manipulating, querying, and searching the collected data: (1) the Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework.29 Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with the list of developers whose name is "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
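Such endpoints can be consumed with any HTTP client. As a minimal sketch, the snippet below queries the endpoint shown above using the standard java.net.http client (Java 11+); the exact JSON fields of the response are not prescribed here.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmClient {
    public static void main(String[] args) throws Exception {
        // Query the SDS RESTful API for developers named "petrillo".
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
            .header("Accept", "application/json")
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        // The body is a JSON document listing the matching developers.
        System.out.println(response.body());
    }
}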

SQL Query Console The SDS provides a console30 that accepts SQL queries on the debugging data, providing relational aggregations and functions.
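For illustration, the same relational data can also be queried programmatically over JDBC, as sketched below. The connection string, credentials, and the table and column names (breakpoint, type_name) are assumptions, since the SDS relational schema is not detailed here.

import java.sql.*;

public class BreakpointStats {
    public static void main(String[] args) throws SQLException {
        // Connect to the SDS PostgreSQL database (URL and credentials are placeholders).
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:postgresql://localhost:5432/swarm", "swarm", "secret");
             Statement stmt = conn.createStatement();
             // Aggregate breakpoints by the type on which they were set
             // (table and column names are assumptions).
             ResultSet rs = stmt.executeQuery(
                 "SELECT type_name, COUNT(*) AS breakpoints " +
                 "FROM breakpoint GROUP BY type_name ORDER BY breakpoints DESC")) {
            while (rs.next()) {
                System.out.println(rs.getString("type_name")
                        + ": " + rs.getLong("breakpoints"));
            }
        }
    }
}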

Full-text Search Engine The SDS also provides an ElasticSearch,31 a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig. 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co

Fig. 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
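Such queries can also be issued programmatically. The sketch below uses the official Neo4j Java driver; the node label Method and the relationship type INVOKES are assumptions about the SDS graph schema, chosen to mirror the invocation concept described above.

import org.neo4j.driver.*;

public class InvocationQuery {
    public static void main(String[] args) {
        // Connect to the SDS Neo4J instance (URI and credentials are placeholders).
        try (Driver driver = GraphDatabase.driver(
                 "bolt://localhost:7687", AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            // Match pairs of invoking/invoked methods recorded during sessions.
            Result result = session.run(
                "MATCH (a:Method)-[:INVOKES]->(b:Method) " +
                "RETURN a.name AS caller, b.name AS callee LIMIT 25");
            while (result.hasNext()) {
                Record row = result.next();
                System.out.println(row.get("caller").asString()
                        + " -> " + row.get("callee").asString());
            }
        }
    }
}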

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and the stack-trace items are analyzed by the Tracer, extracting method invocations.
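A minimal sketch of this tracing mechanism is shown below, using the Eclipse debug core APIs named above; the SwarmService call is a hypothetical placeholder for the SDT facade that ships events to the SDS, not the actual SDT code.

import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IDebugEventSetListener;

public class DebugTracer implements IDebugEventSetListener {

    public void start() {
        // Subscribe to all debug events fired by the Eclipse debugger.
        DebugPlugin.getDefault().addDebugEventListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Suspensions caused by breakpoints or step ends are the events
            // the SDT records and shares through the Swarm Debugging Services.
            if (event.getKind() == DebugEvent.SUSPEND
                    && (event.getDetail() == DebugEvent.BREAKPOINT
                        || event.getDetail() == DebugEvent.STEP_END)) {
                // SwarmService is a hypothetical RESTful facade.
                // SwarmService.post(event);
            }
        }
    }
}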

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com

Fig. 17 The Swarm Tracer architecture [17]

Fig. 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing one invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig. 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices, which send RESTful messages to a SwarmRestFacade connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (a non-invoked method) and the last node is an ending method (a non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw() in Figure 20. In addition, developers can go directly to a method in the Eclipse editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated at the top of a tree, and adjacent nodes represent invocation sequences.

Fig. 20 Sequence stack diagram for the Bridge design pattern

Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
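Under the hood, such a fuzzy lookup maps to a standard ElasticSearch fuzzy query. The sketch below issues one over HTTP; the index name (swarm) and field name (typeName) are assumptions about how the SDS indexes breakpoints.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BreakpointFuzzySearch {
    public static void main(String[] args) throws Exception {
        // ElasticSearch fuzzy query matching "fcatory" against "factory"-like terms.
        String query = "{ \"query\": { \"fuzzy\": "
                + "{ \"typeName\": { \"value\": \"fcatory\" } } } }";
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("http://localhost:9200/swarm/_search"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(query))
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON hits with matching breakpoints
    }
}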

Fig. 21 Method call graph for the Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertices V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertices that invoke methods and β is the subset of all vertices that are invoked, then the starting and ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
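Computing these sets from the collected invocations is straightforward, as the sketch below illustrates; the Invocation record is a stand-in for the SDS invocation entries described earlier, and the sample method names echo the Bridge example of Figure 20.

import java.util.*;

public class EntryExitFinder {
    // A collected invocation: invoking method -> invoked method.
    record Invocation(String invoking, String invoked) {}

    public static void main(String[] args) {
        List<Invocation> edges = List.of(
            new Invocation("Main.main", "Circle.draw"),
            new Invocation("Circle.draw", "DrawingAPI.drawCircle"));

        Set<String> alpha = new HashSet<>(); // vertices that invoke methods
        Set<String> beta = new HashSet<>();  // vertices that are invoked
        for (Invocation e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }

        Set<String> starting = new HashSet<>(alpha); // in alpha, not in beta
        starting.removeAll(beta);
        Set<String> ending = new HashSet<>(beta);    // in beta, not in alpha
        ending.removeAll(alpha);

        System.out.println("Starting points: " + starting); // [Main.main]
        System.out.println("Ending points: " + ending);     // [DrawingAPI.drawCircle]
    }
}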

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


RQ2 What is the effort in time for setting the first breakpoint in relation to the debugging task's elapsed time?

For each session, we normalized the data using Equation 1 and associated the ratios with their respective task elapsed times. Figure 5 combines the data from the debugging sessions; each point in the plot represents a debugging session with a specific rate of breakpoints per minute. Analysing the first-breakpoint data, we found a correlation between task elapsed time and the time of the first breakpoint (ρ = -0.47): the task elapsed time is inversely correlated with the time of the task's first breakpoint. The data are fitted by:

f(x) = α / x^β    (2)

where α = 1.2 and β = 0.44.

Fig. 5 Relation between the time of the first breakpoint and the task elapsed time (data from the two studies)

We observe that when developers toggle breakpoints carefully, they complete tasks faster than developers who set breakpoints quickly.

This finding also corroborates previous results found with a different set of tasks [17].


RQ3 Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints were set on call statements, while only 1% (3/207) were set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints were set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where Study 1 yielded 17% while Study 2 showed 27%. After grouping if-statements, returns, and while-loops into control-flow statements, we found that 30% of breakpoints were on control-flow statements, while 53% were on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statements     Number of Breakpoints   %
call           111                     53%
if-statement   39                      19%
assignment     36                      17%
return         18                      10%
while-loop     3                       1%

Table 4 Study 2 - Breakpoints per type of statement

Statements     Number of Breakpoints   %
call           43                      43%
if-statement   22                      22%
assignment     27                      27%
return         4                       4%
while-loop     4                       4%

Our results show that, in both studies, about 50% of the breakpoints were set on call statements, while control-flow-related statements were comparatively fewer, the while-loop statement being the least common (2-4%).


RQ4 Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on:

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Rohm [15]. We suggest that there is an underlying rationale in these decisions, because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Task  Class               Line of Code  Breakpoints
0318  AuthorsFormatter    43            5
0318  AuthorsFormatter    131           3
0667  BasePanel           935           2
0667  BasePanel           969           3
0667  JabRefDesktop       430           2
0669  OpenDatabaseAction  268           2
0669  OpenDatabaseAction  433           4
0669  OpenDatabaseAction  451           4
0993  EntryEditor         717           2
0993  EntryEditor         720           2
0993  EntryEditor         723           2
0993  BibDatabase         187           2
0993  BibDatabase         456           2
1026  EntryEditor         1184          2
1026  BibtexParser        160           2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Class                  Line of Code  Breakpoints
PdfReader              230           2
PdfReader              806           2
PdfReader              1923          2
ConsoleServicesFacade  89            2
ConsoleClient          81            2
PdfUtility             94            2
PdfUtility             96            2
PdfUtility             102           2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Class              Line of Code  Breakpoints
icsUtils           333           3
Game               1751          2
ExamineController  41            2
ExamineController  84            3
ExamineController  87            2
ExamineController  92            2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line of code 969 in class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the number of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, accounting for 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Class                      Lines of Code (Breakpoints)
BibtexParser               138 (2), 151 (2), 159 (2), 160 (3), 165 (2), 168 (3), 176 (2), 198 (2), 199 (2), 299 (2)
EntryEditor                717 (3), 720 (4), 721 (2), 723 (2), 837 (3), 842 (2), 1184 (3), 1393 (2)
BibDatabase                175 (2), 187 (3), 223 (2), 456 (6)
OpenDatabaseAction         433 (4), 450 (2), 451 (4)
JabRefDesktop              40 (2), 84 (2), 430 (3)
SaveDatabaseAction         177 (4), 188 (2)
BasePanel                  935 (2), 969 (5)
AuthorsFormatter           43 (5), 131 (4)
EntryTableTransferHandler  346 (2)
FieldTextMenu              84 (2)
JabRefFrame                1119 (2)
JabRefMain                 8 (5)
URLUtil                    95 (2)

Fig. 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and that 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints


Table 9 Study 1 - Breakpoints by class across different tasks

Types Issue 318 Issue 667 Issue 669 Issue 993 Issue 1026 Breakpoints Dev Diversities

SaveDatabaseAction Yes Yes Yes 7 2

BasePanel Yes Yes Yes Yes 14 7

JabRefDesktop Yes Yes 9 4

EntryEditor Yes Yes Yes 36 4

BibtexParser Yes Yes Yes 44 6

OpenDatabaseAction Yes Yes Yes 19 13

JabRef Yes Yes Yes 3 3

JabRefMain Yes Yes Yes Yes 5 4

URLUtil Yes Yes 4 2

BibDatabase Yes Yes Yes 19 4

and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same line of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers working on different tasks set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5 Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6 Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program, using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is reference-management software developed in Java. It is open source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig. 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers,21 23 male and seven female. Our participants have, on average, six years of experience in software development (st. dev. four years). They have, on average, 4.8 years of Java experience (st. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 11 participated in fault location (controlled experiment: 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We made this video available on-line.24 The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV in a regular Web browser, without access to the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line,25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5 Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type containing the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found: 27/31 sessions) and "bad" sessions (the fault was not found: 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig. 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig. 9 GV for Task 0667

Fig. 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.

Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations, without previously seeing the code.

RQ6 Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video-recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on a selected type. We asked participants if GV is useful to support software-maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig. 11 GV usefulness - experimental phase one

Fig. 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


Table 10 Results from the control and experimental groups (average)

Task 0993
Metric            Control [C]  Experiment [E]  ∆ [C-E] (s)  [E/C] (%)
First breakpoint  00:02:55     00:03:40        -44          126
Time to start     00:04:44     00:05:18        -33          112
Elapsed time      00:30:08     00:16:05        843          53

Task 1026
Metric            Control [C]  Experiment [E]  ∆ [C-E] (s)  [E/C] (%)
First breakpoint  00:02:42     00:04:48        -126         177
Time to start     00:04:02     00:03:43        19           92
Elapsed time      00:24:58     00:20:41        257          83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time setting the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time finishing both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully toggling the first breakpoint but then completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation, and we now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We return to some of these limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example, a null-pointer exception, and the location of the source of the fault, for example, a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information in the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap One participant warned that our visualisation could lead developers into a "trap" if all the developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to the other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting its main functionalities.

Complementing Differential Views Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities A third participant suggested that larger nodes could represent classes that could be refactored if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and to perform more experiments to find further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for Software Engineering researchers, developers, debugger developers, and educators. The SDI (and GV) is open and freely available on-line,28 and researchers can use it to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns and to identify the debugging strategies that are more efficient in the context of their projects, to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent toggling a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools, using novel data-mining techniques, to integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks, by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time, by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities in general: they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc., through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise works related to debugging to allow a better positioning of our study among the published research.

Program Understanding Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information; it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.

Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before making changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after making changes, by running the programs. In their studies, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort in debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance re-quiring dedication and expertise However despite its importance developersrsquodebugging behaviors have not been extensively and comprehensively studiedIn this paper we introduced the concept of Swarm Debugging based on thefact that developers performing different debugging sessions build collectiveknowledge We asked what debugging information is useful to share amongdevelopers to ease debugging We particularly studied two pieces of debugginginformation breakpoints (and their locations) and sessions (debugging paths)because these pieces of information are related to the two main activities dur-ing debugging setting breakpoints and stepping inoverout statements

To evaluate the usefulness of Swarm Debugging and the sharing of de-bugging data we conducted two observational studies In the first study tounderstand how developers set breakpoints we collected and analyzed morethan 10 hours of developersrsquo videos in 45 debugging sessions performed by 28different independent developers containing 307 breakpoints on three soft-ware systems

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is a first step towards intelligent debuggers and context-aware IDEs: programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agency CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, in Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, in 2010 ACM/IEEE 32nd International Conference on Software Engineering, vol. 1, p. 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Röhm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, in Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ESEC/FSE) (ACM Press, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, in Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, in Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig. 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We choose and define domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig. 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the data collected when a developer performs some action during a debugging session.

The SDS provides several services for manipulating, querying, and searching the collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built with the Spring Boot framework.29 Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
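For illustration, such a request can also be issued programmatically. The sketch below is our own example rather than part of the SDS code base; it uses the HttpClient API from Java 11 against the endpoint above, and the exact JSON layout of the response depends on the Spring Data REST conventions of the SDS version at hand.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class SdsClientExample {
        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            // Query the SDS for all developers named "petrillo".
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://swarmdebugging.org/developers/"
                            + "search/findByName?name=petrillo"))
                    .header("Accept", "application/json")
                    .GET()
                    .build();
            // The SDS answers with a JSON document listing the matching developers.
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode());
            System.out.println(response.body());
        }
    }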

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
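As a hypothetical example of the kind of query the console supports, the JDBC sketch below counts breakpoints per type; the connection URL, credentials, and the breakpoint table and column names are assumptions made for illustration, not the documented SDS schema.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class BreakpointStatsExample {
        public static void main(String[] args) throws Exception {
            // Illustrative connection settings; point them at your own SDS database.
            String url = "jdbc:postgresql://localhost:5432/swarm";
            try (Connection conn = DriverManager.getConnection(url, "swarm", "secret");
                 Statement stmt = conn.createStatement();
                 // Aggregate breakpoints by the type they were toggled on.
                 ResultSet rs = stmt.executeQuery(
                         "SELECT type_name, COUNT(*) AS breakpoints "
                         + "FROM breakpoint GROUP BY type_name "
                         + "ORDER BY breakpoints DESC LIMIT 10")) {
                while (rs.next()) {
                    System.out.println(rs.getString("type_name")
                            + ": " + rs.getLong("breakpoints"));
                }
            }
        }
    }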

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With this dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig. 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig. 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
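To give an idea of such a pattern query, the sketch below runs a Cypher statement through the Neo4j Java driver (1.x API) to list invoking/invoked method pairs; the Method label, the INVOKES relationship type, and the connection settings are our assumptions for illustration, not necessarily the actual SDS graph schema.

    import org.neo4j.driver.v1.AuthTokens;
    import org.neo4j.driver.v1.Driver;
    import org.neo4j.driver.v1.GraphDatabase;
    import org.neo4j.driver.v1.Record;
    import org.neo4j.driver.v1.Session;
    import org.neo4j.driver.v1.StatementResult;

    public class InvocationQueryExample {
        public static void main(String[] args) {
            // Illustrative connection settings; point them at your own Neo4J instance.
            try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                    AuthTokens.basic("neo4j", "password"));
                 Session session = driver.session()) {
                // Cypher declaratively describes the pattern "a method invokes another".
                StatementResult result = session.run(
                        "MATCH (a:Method)-[:INVOKES]->(b:Method) "
                        + "RETURN a.name AS caller, b.name AS callee LIMIT 25");
                while (result.hasNext()) {
                    Record record = result.next();
                    System.out.println(record.get("caller").asString()
                            + " -> " + record.get("callee").asString());
                }
            }
        }
    }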

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
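As a minimal sketch of this mechanism (a simplified stand-in for the actual SDT code, with the event handling reduced to comments), a tracer can register itself with the Eclipse debug framework as follows:

    import org.eclipse.core.resources.IMarkerDelta;
    import org.eclipse.debug.core.DebugEvent;
    import org.eclipse.debug.core.DebugPlugin;
    import org.eclipse.debug.core.IBreakpointListener;
    import org.eclipse.debug.core.IDebugEventSetListener;
    import org.eclipse.debug.core.model.IBreakpoint;

    public class MinimalDebugTracer
            implements IDebugEventSetListener, IBreakpointListener {

        public void start() {
            // Receive stepping/suspend events and breakpoint changes.
            DebugPlugin.getDefault().addDebugEventListener(this);
            DebugPlugin.getDefault().getBreakpointManager()
                    .addBreakpointListener(this);
        }

        @Override
        public void handleDebugEvents(DebugEvent[] events) {
            for (DebugEvent event : events) {
                if (event.getKind() == DebugEvent.SUSPEND
                        && event.getDetail() == DebugEvent.STEP_END) {
                    // A step (into/over/return) just ended: inspect the stack
                    // trace here and send the extracted invocation to the SDS.
                }
            }
        }

        @Override
        public void breakpointAdded(IBreakpoint breakpoint) {
            // Capture the breakpoint location (type, line) and persist it.
        }

        @Override
        public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

        @Override
        public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
    }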

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and the stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig. 17 The Swarm Tracer architecture [17]

Fig. 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig. 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] that represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated at the top of a tree, and adjacent nodes represent invocation sequences.


Fig. 20 Sequence stack diagram for the Bridge design pattern

Researchers can navigate sequences of invoked methods by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.


Fig. 21 Method call graph for the Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph $G = (V, E)$, where $V$ is a set of vertexes $V = \{V_1, V_2, \ldots, V_n\}$ and $E$ is a set of edges $E = \{(V_1, V_2), (V_1, V_3), \ldots\}$, each edge is formed by a pair $\langle V_i, V_j \rangle$, where $V_i$ is the invoking method and $V_j$ is the invoked method. If $\alpha$ is the subset of all vertexes that invoke methods and $\beta$ is the subset of all vertexes that are invoked, then the starting and ending methods are:

$$StartingPoint = \{ V_{SP} \mid V_{SP} \in \alpha \wedge V_{SP} \notin \beta \}$$

$$EndingPoint = \{ V_{EP} \mid V_{EP} \in \beta \wedge V_{EP} \notin \alpha \}$$

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
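These definitions translate directly into set operations over the collected invocation edges, as the following self-contained sketch (with made-up method names) illustrates:

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class EntryPointsExample {
        public static void main(String[] args) {
            // Each pair is <invoking method, invoked method>.
            List<String[]> invocations = List.of(
                    new String[] {"main", "parse"},
                    new String[] {"parse", "parseFileContent"},
                    new String[] {"parseFileContent", "readLine"});

            Set<String> invoking = new HashSet<>(); // alpha
            Set<String> invoked = new HashSet<>();  // beta
            for (String[] edge : invocations) {
                invoking.add(edge[0]);
                invoked.add(edge[1]);
            }

            Set<String> starting = new HashSet<>(invoking);
            starting.removeAll(invoked); // alpha minus beta
            Set<String> ending = new HashSet<>(invoked);
            ending.removeAll(invoking);  // beta minus alpha

            System.out.println("Starting methods: " + starting); // [main]
            System.out.println("Ending methods: " + ending);     // [readLine]
        }
    }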

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developer explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


RQ3: Are there consistent common trends with respect to the types of statements on which developers set breakpoints?

We classified the types of statements on which the participants set their breakpoints and analysed each breakpoint. For Study 1, Table 3 shows, for example, that 53% (111/207) of the breakpoints are set on call statements, while only 1% (3/207) are set on while-loop statements. For Study 2, Table 4 shows similar trends: 43% (43/100) of breakpoints are set on call statements and only 4% (4/100) on while-loop statements. The only difference is on assignment statements, where in Study 1 we found 17% while Study 2 showed 27%. After grouping if-statement, return, and while-loop into control-flow statements, we found that 30% of breakpoints are on control-flow statements, while 53% are on call statements and 17% on assignments.

Table 3 Study 1 - Breakpoints per type of statement

Statement       Breakpoints   %
call            111           53%
if-statement    39            19%
assignment      36            17%
return          18            10%
while-loop      3             1%

Table 4 Study 2 - Breakpoints per type of statement

Statement       Breakpoints   %
call            43            43%
if-statement    22            22%
assignment      27            27%
return          4             4%
while-loop      4             4%


Our results show that, in both studies, 50% of the breakpoints were set on call statements, while control-flow-related statements were comparatively fewer, the while-loop statement being the least common (2-4%).


RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and on different tasks. We sorted all the breakpoints in our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line, for the same task, by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PDFSaM and six in Raptor. For example, in Study 1, on line 969 in class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Röhm [15]. We suggest that there is an underlying rationale in that decision because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef), by task

Task   Class                Line of Code   Breakpoints
0318   AuthorsFormatter     43             5
0318   AuthorsFormatter     131            3
0667   BasePanel            935            2
0667   BasePanel            969            3
0667   JabRefDesktop        430            2
0669   OpenDatabaseAction   268            2
0669   OpenDatabaseAction   433            4
0669   OpenDatabaseAction   451            4
0993   EntryEditor          717            2
0993   EntryEditor          720            2
0993   EntryEditor          723            2
0993   BibDatabase          187            2
0993   BibDatabase          456            2
1026   EntryEditor          1184           2
1026   BibtexParser         160            2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Class                   Line of Code   Breakpoints
PdfReader               230            2
PdfReader               806            2
PdfReader               1923           2
ConsoleServicesFacade   89             2
ConsoleClient           81             2
PdfUtility              94             2
PdfUtility              96             2
PdfUtility              102            2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Class               Line of Code   Breakpoints
icsUtils            333            3
Game                1751           2
ExamineController   41             2
ExamineController   84             3
ExamineController   87             2
ExamineController   92             2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line 969 in class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on each class for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the number of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, resulting in 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints, in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing us to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef), in all tasks

Class                       Lines of code (breakpoints)
BibtexParser                138 (2), 151 (2), 159 (2), 160 (3), 165 (2), 168 (3), 176 (2), 198 (2), 199 (2), 299 (2)
EntryEditor                 717 (3), 720 (4), 721 (2), 723 (2), 837 (3), 842 (2), 1184 (3), 1393 (2)
BibDatabase                 175 (2), 187 (3), 223 (2), 456 (6)
OpenDatabaseAction          433 (4), 450 (2), 451 (4)
JabRefDesktop               40 (2), 84 (2), 430 (3)
SaveDatabaseAction          177 (4), 188 (2)
BasePanel                   935 (2), 969 (5)
AuthorsFormatter            43 (5), 131 (4)
EntryTableTransferHandler   346 (2)
FieldTextMenu               84 (2)
JabRefFrame                 1119 (2)
JabRefMain                  8 (5)
URLUtil                     95 (2)

Fig. 6 Methods with 5 or more breakpoints

Finally, we count how many breakpoints are in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We find that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Table 9 Study 1 - Breakpoints by class across different tasks

Class                Tasks with breakpoints   Breakpoints   Developer diversity
SaveDatabaseAction   3 of 5                   7             2
BasePanel            4 of 5                   14            7
JabRefDesktop        2 of 5                   9             4
EntryEditor          3 of 5                   36            4
BibtexParser         3 of 5                   44            6
OpenDatabaseAction   3 of 5                   19            13
JabRef               3 of 5                   3             3
JabRefMain           4 of 5                   5             4
URLUtil              2 of 5                   4             2
BibDatabase          3 of 5                   19            4

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same line of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, working on different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviours of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is a reference management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig. 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelance developers21, 23 male and seven female. Our participants have, on average, six years of experience in software development (st. dev. four years). They have, on average, 4.8 years of Java experience (st. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in the qualitative evaluation (qualitative evaluation of GV) and 13 participated in fault location (controlled experiment: seven control and six experimental) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", as well as questions about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools, to perform a warm-up task, and to perform the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants; we make this video available on-line.24 The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 https://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type containing the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com

For Task 318 (Figure 8), 95% of the participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig. 8 GV for Task 0318

For Task 667 (Figure 9), 95% of the participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig. 9 GV for Task 0667

Swarm Debugging the Collective Intelligence on Interactive Debugging 27

Fig. 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of the participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% – 0/23) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without previously seeing the code.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants whether GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.

28Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig. 11 GV usefulness - experimental phase one

Fig. 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.


Table 10 Results from the control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   % [E/C]
First breakpoint   00:02:55      00:03:40         -44           126%
Time to start      00:04:44      00:05:18         -33           112%
Elapsed time       00:30:08      00:16:05         843           53%

Task 1026

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   % [E/C]
First breakpoint   00:02:42      00:04:48         -126          177%
Time to start      00:04:02      00:03:43         19            92%
Elapsed time       00:24:58      00:20:41         257           83%

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to carefully toggle the first breakpoint but consecutively completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results for RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation, and we now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We go back to some of these limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where a field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information in the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all the developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently in different parts of the graph, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so I just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point out further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debugger developers, and educators. The SDI (and GV) is open and freely available on-line28, and researchers can use it to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns and identify the debugging strategies that are most efficient in the context of their projects, thus improving their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits and create new tools that, using novel data-mining techniques, integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed over time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered the question of what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on how developers set breakpoints and step through code, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a larger number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies on these research questions and others, possibly drawn from the ones that developers asked in another study, by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens, and we compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc., through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant between the first and last task. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that none of the participants looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise work related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., builds on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should follow during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths in a graph visualisation from several debugging sessions produced elements to support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson. Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211-220
4. R. Stallman, S. Shebs. Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright. GNU DDD - Data Display Debugger (2010)
6. A. Ko, in Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13-16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers. 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126-135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492-501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100-116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman. How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen. Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Röhm. Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140-144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285-295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng. Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar. Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96-106. DOI 10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz. Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel. Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6
25. T. Ball, S. Eick. Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn. Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming. IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11-20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter. IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572-583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers. Proceedings of the 12th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108-124 (1997). DOI 10.1145/263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker. Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212
33. S. Jiang, C. McMillan, R. Santelices. Empirical Software Engineering, pp. 1-39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271-278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder. IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung. IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53-63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14-24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689-699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1-11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251-260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31-40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456-465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391-400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99-110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant. International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson. Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ESEC/FSE '09) (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, in Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken. Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569-1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa. Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485-495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer. Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110-119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175-. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063-3072. DOI 10.1145/2466416.2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan. ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture
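For illustration only, a minimal SDS-style endpoint could be declared with Spring Boot as in the sketch below; the class, record, and route names are our assumptions, not the actual SDS code, and persistence is omitted.

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@RestController
public class SdsApplication {

    /** Hypothetical payload mirroring the Breakpoint concept of the metadata model. */
    public record BreakpointEvent(String session, String type, String method, int line) { }

    /** Receives a breakpoint toggled in the IDE; a real SDS would persist it in
        PostgreSQL, ElasticSearch, and Neo4J, as described above. */
    @PostMapping("/breakpoints")
    public BreakpointEvent addBreakpoint(@RequestBody BreakpointEvent event) {
        return event; // echo back the stored representation
    }

    public static void main(String[] args) {
        SpringApplication.run(SdsApplication.class, args);
    }
}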

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developers, projects, and debugging events.


Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.

– Event is data collected when a developer performs some actions during a debugging session.

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with the list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
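As an illustration, the endpoint above could be queried programmatically as in the following sketch, which uses the standard Java 11 HTTP client; the exact JSON shape of the response is an assumption.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmApiClient {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .GET()
                .build();
        // The SDS answers with a JSON document listing the matching developers.
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}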

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
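The following sketch illustrates how a tracer can register the two listener interfaces named above with the Eclipse debug framework. Only the Eclipse APIs are real; the class name and the data-sending steps (indicated in comments) are ours.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

/** Illustrative tracer observing stepping events and breakpoint changes. */
public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    /** Registers this tracer with the Eclipse debug framework. */
    public void start() {
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);
        plugin.getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // A SUSPEND event with detail STEP_END follows Step Into/Over/Return.
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.STEP_END) {
                // Here the SDT would analyse the stack trace and send the
                // extracted method invocations to the Swarm Debugging Services.
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // Here the SDT would capture the type and line and share them.
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}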

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated at the top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
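For illustration, a fuzzy search like the one in Figure 19 could be expressed as an ElasticSearch fuzzy query sent over HTTP, as sketched below; the host, index name, and field name are assumptions, not the actual SDS schema.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BreakpointFuzzySearch {
    public static void main(String[] args) throws Exception {
        // Fuzzy query matching type names close to the misspelled "fcatory".
        String query = "{\"query\": {\"fuzzy\": "
                + "{\"typeName\": {\"value\": \"fcatory\", \"fuzziness\": \"AUTO\"}}}}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search")) // assumed index
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON hits with the matching breakpoints
    }
}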


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertices V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertices invoking methods and β is the subset of all vertices invoked by methods, then the Starting and Ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
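Operationally, both sets can be computed in one pass over the collected invocation edges, as in this sketch (illustrative code in modern Java, not the SDS implementation; the SDS could equally obtain these sets with a Cypher query over its Neo4J store):

import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Computes starting and ending methods from invocation edges (invoking -> invoked). */
public class GraphEndpoints {

    public record Invocation(String invoking, String invoked) { }

    /** Holds starting methods (alpha minus beta) and ending methods (beta minus alpha). */
    public record Endpoints(Set<String> starting, Set<String> ending) { }

    public static Endpoints compute(List<Invocation> edges) {
        Set<String> alpha = new HashSet<>(); // methods that invoke others
        Set<String> beta = new HashSet<>();  // methods that are invoked
        for (Invocation edge : edges) {
            alpha.add(edge.invoking());
            beta.add(edge.invoked());
        }
        Set<String> starting = new HashSet<>(alpha);
        starting.removeAll(beta);            // VSP in alpha and not in beta
        Set<String> ending = new HashSet<>(beta);
        ending.removeAll(alpha);             // VEP in beta and not in alpha
        return new Endpoints(starting, ending);
    }
}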

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions and their call graphs is easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.



RQ4: Are there consistent common trends with respect to the lines, methods, or classes on which developers set breakpoints?

We investigated each breakpoint to assess whether there were breakpoints on the same line of code for different participants performing the same tasks, i.e., resolving the same fault, by comparing the breakpoints on the same task and different tasks. We sorted all the breakpoints from our data by the class in which they were set and by line number, and we counted how many times a breakpoint was set on exactly the same line of code across participants. We report the results in Table 5 for Study 1 and in Tables 6 and 7 for Study 2.

In Study 1, we found 15 lines of code with two or more breakpoints on the same line for the same task by different participants. In Study 2, we observed breakpoints on exactly the same lines for eight lines of code in PdfSam and six in Raptor. For example, in Study 1, on line 969 of class BasePanel, participants set a breakpoint on

JabRefDesktop.openExternalViewer(metaData(), link.toString(), field)

Three different participants set three breakpoints on that line for Issue 667. Tables 5, 6, and 7 report all recurring breakpoints. These observations show that participants do not choose breakpoints purposelessly, as suggested by Tiarks and Röhm [15]. We suggest that there is an underlying rationale behind that decision because different participants set breakpoints on exactly the same lines of code.

Table 5 Study 1 - Breakpoints in the same line of code (JabRef) by task

Task   Class                Line of Code   Breakpoints
0318   AuthorsFormatter     43             5
0318   AuthorsFormatter     131            3
0667   BasePanel            935            2
0667   BasePanel            969            3
0667   JabRefDesktop        430            2
0669   OpenDatabaseAction   268            2
0669   OpenDatabaseAction   433            4
0669   OpenDatabaseAction   451            4
0993   EntryEditor          717            2
0993   EntryEditor          720            2
0993   EntryEditor          723            2
0993   BibDatabase          187            2
0993   BibDatabase          456            2
1026   EntryEditor          1184           2
1026   BibtexParser         160            2


Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Class                   Line of Code   Breakpoints
PdfReader               230            2
PdfReader               806            2
PdfReader               1923           2
ConsoleServicesFacade   89             2
ConsoleClient           81             2
PdfUtility              94             2
PdfUtility              96             2
PdfUtility              102            2

Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Class               Line of Code   Breakpoints
icsUtils            333            3
Game                1751           2
ExamineController   41             2
ExamineController   84             3
ExamineController   87             2
ExamineController   92             2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on line of code 969 in class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, resulting in 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints in 3 out of 5 tasks by 13 different developers. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Class                       Lines of Code (Breakpoints)
BibtexParser                138 (2), 151 (2), 159 (2), 160 (3), 165 (2), 168 (3), 176 (2), 198 (2), 199 (2), 299 (2)
EntryEditor                 717 (3), 720 (4), 721 (2), 723 (2), 837 (3), 842 (2), 1184 (3), 1393 (2)
BibDatabase                 175 (2), 187 (3), 223 (2), 456 (6)
OpenDatabaseAction          433 (4), 450 (2), 451 (4)
JabRefDesktop               40 (2), 84 (2), 430 (3)
SaveDatabaseAction          177 (4), 188 (2)
BasePanel                   935 (2), 969 (5)
AuthorsFormatter            43 (5), 131 (4)
EntryTableTransferHandler   346 (2)
FieldTextMenu               84 (2)
JabRefFrame                 1119 (2)
JabRefMain                  8 (5)
URLUtil                     95 (2)

Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntryEditor.storeSource received 24 breakpoints and the method BibtexParser.parseFileContent received 20 breakpoints by different developers on different tasks.

Table 9 Study 1 - Breakpoints by class across different tasks

Class                Tasks with breakpoints (of 5)   Breakpoints   Developer diversity
SaveDatabaseAction   3                               7             2
BasePanel            4                               14            7
JabRefDesktop        2                               9             4
EntryEditor          3                               36            4
BibtexParser         3                               44            6
OpenDatabaseAction   3                               19            13
JabRef               3                               3             3
JabRefMain           4                               5             4
URLUtil              2                               4             2
BibDatabase          3                               19            4

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same line of code for the same task, and different developers set breakpoints on the same type or method for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews, focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser and (2) a controlled experiment on fault-location tasks in a Tetris program using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as subject system. JabRef is a reference management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21, 23 male and seven female. Our participants have, on average, six years of experience in software development (st. dev. four years). They have, on average, 4.8 years of Java experience (st. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in a qualitative evaluation (qualitative evaluation of GV) and 11 participated in fault location (controlled experiment - 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subject's profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line.24 The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues using GV on a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, such as whether they completed the tasks successfully or not, and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 video recording tool.

6.2 Results

We now discuss the results of our evaluation

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions, in which the fault was found (27/31 sessions), and "bad" sessions, in which it was not (4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig 9 GV for Task 0667


Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths in a graph visualisation from several debugging sessions helps developers produce correct hypotheses about fault locations without seeing the code previously.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video recording analysis): navigating, reorganizing, and especially diving into the type by double-clicking on a selected type. We asked participants if GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) through our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) on the task survey after fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig 11 GV usefulness - experimental phase one

Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are near the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to carefully toggle the first breakpoint but consequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We go back to some of the limitations in the section that describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example, a null-pointer exception, and the location of the source of the fault, for example, a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known to be not scalable, so we are expecting issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently in different parts of the graph to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to identify further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, developers of debuggers, and educators. The SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns and to identify the debugging strategies that are more efficient in the context of their projects, improving their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, breakpoints previously toggled by a developer could assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time, by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern their internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant between the first and last task. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruptions [42], editing patterns [43, 44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6, 52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], as described by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that should be done by developers during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort in debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos from 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to and can be starting points for developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson. Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs. Debugging with GDB – The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright. GNU DDD – Data Display Debugger (2010)
6. A. Ko. Proceedings of the 28th International Conference on Software Engineering – ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 – Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers. 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering – FOSE 2014 (ACM Press, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman. How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen. Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Röhm. Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng. IEEE Software 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar. Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-Oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz. Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel. Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick. Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn. Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming. IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter. IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers. Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications – OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker. Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices. Empirical Software Engineering pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder. IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung. IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension – ICPC 2014 (ACM Press, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering – FSE 2014 (ACM Press, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings, International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings, Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant. International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson. Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE '09) (ACM Press, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso. Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken. Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 – Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa. Advances in Software Engineering 2012, 1–13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings – International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer. Proceedings – IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan. ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551

Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has source code and a file. SDS only considers types that have source code available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.

– Event is event data collected when a developer performs some actions during a debugging session (see the sketch after this list).
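To make the meta-model concrete, the sketch below renders a few of these concepts as plain Java data classes. All field names are our own assumptions derived from Figure 14, not the actual SDI persistence schema.

// Illustrative sketch only: field names are assumptions drawn from the
// meta-model in Figure 14, not the actual SDI persistence schema.
class Developer { long id; String name; }

class Session { long id; Developer developer; String task; String product; }

class Breakpoint {
    long id;
    Session session;    // the debugging session in which it was toggled
    String typeName;    // the class or interface owning the code
    String methodName;  // the enclosing method, if any
    int lineNumber;     // the source line where it was set
    long createdAt;     // when the SDT captured the event
}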

The SDS provides several services for manipulating, querying, and searching the collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
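For illustration, the request above can be issued from any HTTP client. The sketch below uses the Java 11 HttpClient; the response shape shown in the comment follows Spring Data REST's usual HAL convention and is illustrative, not the exact SDS payload.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class DeveloperSearch {
    public static void main(String[] args) throws Exception {
        // GET the search endpoint shown above.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // Expected (illustrative): a JSON document embedding the matching
        // developers, e.g. {"_embedded":{"developers":[{"name":"petrillo"}]}}
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}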

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch31, which is a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database, without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
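A query like the one in Figure 16 can also be issued programmatically. The sketch below uses the Neo4j Java driver; the connection URI, the credentials, and the node label and relationship type (Method, INVOKES) are all assumptions, since the actual graph schema is not detailed here.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class InvocationQuery {
    public static void main(String[] args) {
        // Connect to the graph store; URI and credentials are assumptions.
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                 AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            // Label and relationship type are assumptions about the schema.
            Result result = session.run(
                "MATCH (a:Method)-[:INVOKES]->(b:Method) RETURN a.name AS caller, b.name AS callee");
            while (result.hasNext()) {
                Record row = result.next();
                System.out.println(row.get("caller").asString()
                        + " -> " + row.get("callee").asString());
            }
        }
    }
}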

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
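The following sketch shows how such a tracer can hook into the Eclipse debug framework. It is a simplified illustration under the standard Eclipse Debug Core API, not the actual SDT source; the send* helpers are hypothetical placeholders for the RESTful calls described below.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        // Register with the Eclipse debug framework to receive events.
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);
        plugin.getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Step Into/Over/Return appear as RESUME events whose detail
            // marks the start of a step; forward them to the SDS here.
            if (event.getKind() == DebugEvent.RESUME && event.isStepStart()) {
                // sendStepEvent(event);  // hypothetical helper
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // sendBreakpoint(breakpoint);  // hypothetical helper
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}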

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the view "Swarm Manager" and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking over a node in the diagram.

Dynamic Method Call Graphs. These are directed call graphs [31], as shown in Figure 21, to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of breakpoint search, in which the search box contains the misspelled word fcatory.

Fig 19 Breakpoint search tool (fuzzy search example)
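Such a fuzzy search can be expressed with ElasticSearch's standard query DSL. In the sketch below, the index name ("breakpoints") and field name ("typeName") are assumptions about how the SDS stores breakpoint data.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BreakpointFuzzySearch {
    public static void main(String[] args) throws Exception {
        // Standard ElasticSearch fuzzy query; index and field names are
        // assumptions, not the actual SDS mapping.
        String query = "{\"query\": {\"fuzzy\": {\"typeName\": {\"value\": \"fcatory\"}}}}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // Despite the typo, hits should rank type names within a small
        // edit distance, such as a "...Factory" class.
        System.out.println(response.body());
    }
}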


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertices V = {V_1, V_2, ..., V_n} and E is a set of edges E = {(V_1, V_2), (V_1, V_3), ...}, each edge is formed by a pair ⟨V_i, V_j⟩, where V_i is the invoking method and V_j is the invoked method. If α is the subset of all vertices invoking methods and β is the subset of all vertices invoked by methods, then the starting and ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
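To make the definitions concrete, the following self-contained sketch computes the starting and ending methods from a list of invocation edges; the edge data is invented for illustration.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GraphEndpoints {
    public static void main(String[] args) {
        // Each edge <invoking, invoked> is one collected invocation.
        List<String[]> edges = List.of(
                new String[]{"main", "parse"},
                new String[]{"parse", "readEntry"});

        Set<String> alpha = new HashSet<>(); // vertices that invoke others
        Set<String> beta = new HashSet<>();  // vertices that are invoked
        for (String[] e : edges) {
            alpha.add(e[0]);
            beta.add(e[1]);
        }

        Set<String> starting = new HashSet<>(alpha);
        starting.removeAll(beta);            // in alpha and not in beta
        Set<String> ending = new HashSet<>(beta);
        ending.removeAll(alpha);             // in beta and not in alpha

        System.out.println("starting = " + starting); // [main]
        System.out.println("ending = " + ending);     // [readEntry]
    }
}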

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory for developers' explorations. Moreover, dividing software exploration by sessions makes its call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

Table 6 Study 2 - Breakpoints in the same line of code (PdfSam)

Classes                 Lines of Code  Breakpoints
PdfReader               230            2
PdfReader               806            2
PdfReader               1923           2
ConsoleServicesFacade   89             2
ConsoleClient           81             2
PdfUtility              94             2
PdfUtility              96             2
PdfUtility              102            2
Table 7 Study 2 - Breakpoints in the same line of code (Raptor)

Classes            Lines of Code  Breakpoints
icsUtils           333            3
Game               1751           2
ExamineController  41             2
ExamineController  84             3
ExamineController  87             2
ExamineController  92             2

When analysing Table 8, we found 135 lines of code having two or more breakpoints for different tasks by different participants. For example, five different participants set five breakpoints on the line of code 969 in class BasePanel, independently of their tasks (in that case, for three different tasks). This result suggests a potential opportunity to recommend those locations as candidates for new debugging sessions.

We also analysed whether the same class received breakpoints for different tasks. We grouped all breakpoints by class and counted how many breakpoints were set on the classes for different tasks, putting "Yes" if a type had a breakpoint, producing Table 9. We also counted the numbers of breakpoints by type and how many participants set breakpoints on a type.

For Study 1, we observe that ten classes received breakpoints in different tasks by different participants, resulting in 77% (160/207) of breakpoints. For example, class BibtexParser had 21% (44/207) of breakpoints in 3 out of 5 tasks, by 13 different participants. (This analysis only applies to Study 1 because Study 2 has only one task per system, thus not allowing to compare breakpoints across tasks.)


Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes                    Lines of Code        Breakpoints
BibtexParser               138, 151, 159        2, 2, 2
                           160, 165, 168        3, 2, 3
                           176, 198, 199, 299   2, 2, 2, 2
EntryEditor                717, 720, 721        3, 4, 2
                           723, 837, 842        2, 3, 2
                           1184, 1393           3, 2
BibDatabase                175, 187, 223, 456   2, 3, 2, 6
OpenDatabaseAction         433, 450, 451        4, 2, 4
JabRefDesktop              40, 84, 430          2, 2, 3
SaveDatabaseAction         177, 188             4, 2
BasePanel                  935, 969             2, 5
AuthorsFormatter           43, 131              5, 4
EntryTableTransferHandler  346                  2
FieldTextMenu              84                   2
JabRefFrame                1119                 2
JabRefMain                 8                    5
URLUtil                    95                   2

Fig 6 Methods with 5 or more breakpoints

Finally, we counted how many breakpoints were set in the same method across tasks and participants, indicating that there were "preferred" methods for setting breakpoints, independently of task or participant. We found that 37 methods received at least two breakpoints and 13 methods received five or more breakpoints during different tasks by different developers, as reported in Figure 6. In particular, the method EntityEditor.storeSource received 24 breakpoints,

Table 9 Study 1 - Breakpoints by class across different tasks

Types | Issue 318 | Issue 667 | Issue 669 | Issue 993 | Issue 1026 | Breakpoints | Dev Diversities
SaveDatabaseAction Yes Yes Yes 7 2
BasePanel Yes Yes Yes Yes 14 7
JabRefDesktop Yes Yes 9 4
EntryEditor Yes Yes Yes 36 4
BibtexParser Yes Yes Yes 44 6
OpenDatabaseAction Yes Yes Yes 19 13
JabRef Yes Yes Yes 3 3
JabRefMain Yes Yes Yes Yes 5 4
URLUtil Yes Yes 4 2
BibDatabase Yes Yes Yes 19 4

and the method BibtexParser.parseFileContent received 20 breakpoints, by different developers on different tasks.

Our results suggest that developers do not choose breakpoints lightly and that there is a rationale in their setting of breakpoints, because different developers set breakpoints on the same lines of code for the same task, and different developers set breakpoints on the same types or methods for different tasks. Furthermore, our results show that different developers, for different tasks, set breakpoints at the same locations. These results show the usefulness of collecting and sharing breakpoints to assist developers during maintenance tasks.

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers, we conducted a controlled experiment and interviews focusing on analysing the debugging behaviors of 30 professional developers. We intended to evaluate whether sharing information obtained in previous debugging sessions supports debugging tasks. We wish to answer the following two research questions:

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?


6.1 Study design

The study consisted of two parts: (1) a qualitative evaluation using GV in a browser, and (2) a controlled experiment on fault-location tasks in a Tetris program, using GV integrated into Eclipse. The planning, realization, and some results are presented in the following sections.

6.1.1 Subject System

For this qualitative evaluation, we chose JabRef20 as the subject system. JabRef is reference management software developed in Java. It is open-source, and its faults are publicly reported. Moreover, JabRef is of reasonably good quality.

6.1.2 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario, we recruited 30 professional freelancer developers21, 23 male and seven female. Our participants have, on average, six years of experience in software development (st. dev. four years). They have, on average, 4.8 years of Java experience (st. dev. 3.3 years), and 97% used Eclipse. As shown in Figure 7, 67% are advanced or expert Java developers.

Among these professionals, 23 participated in the qualitative evaluation of GV, and 13 participated in fault location (controlled experiment – 7 control and 6 experiment) using the Swarm Debugging Global View (GV) in Eclipse.

20 http://www.jabref.org
21 https://www.freelancer.com


6.1.3 Task Description

We chose debugging tasks to trigger the participants' debugging sessions. We asked participants to find the locations of true faults in JabRef. We picked 5 faults reported against JabRef v3.2 in its issue-tracking system, i.e., Issues 318, 993, 1026, 1173, 1235, and 1251. We asked participants to find the locations of the faults, asking questions such as "Where was the fault for Task 318?" or "For Task 1173, where would you toggle a breakpoint to fix the fault?", and about positive and negative aspects of GV.22

6.1.4 Artifacts and Working Environment

After the subjects' profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line24. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25, with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the locations of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled a report to provide answers and other information, like whether they completed the tasks successfully or not, and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 as the video recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type containing the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig 9 GV for Task 0667


Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without previously seeing the code.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants of the experimental group used GV to support their tasks (video recording analysis): navigating, reorganizing, and especially diving into the type by double-clicking on a selected type. We asked participants if GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) through our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) on the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig 11 GV usefulness - experimental phase one

Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.


Table 10 Results from control and experimental groups (average)

Task 0993

Metric            Control [C]  Experiment [E]  ∆ [C-E] (s)  [E/C] (%)
First breakpoint  00:02:55     00:03:40        -44          126
Time to start     00:04:44     00:05:18        -33          112
Elapsed time      00:30:08     00:16:05        843          53

Task 1026

Metric            Control [C]  Experiment [E]  ∆ [C-E] (s)  [E/C] (%)
First breakpoint  00:02:42     00:04:48        -126         177
Time to start     00:04:02     00:03:43        19           92
Elapsed time      00:24:58     00:20:41        257          83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we note that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to carefully toggle the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time to decide the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation, and we now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We come back to some of the limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example, a null-pointer exception, and the location of the source of the fault, for example, a constructor where the field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently in different parts of the graph, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view, whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily it didn't require too much effort to spot the problem, because all the related code was concentrated inside the parser class. Luckily I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point to further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debuggers' developers, and educators. SDI (and GV) is open and freely available online28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify the debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (line of code, class, or method) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time, by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study that we discuss in this section.

Like any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant between the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging Jiang et al. [33] studied the change impact analysis process that should be done during software maintenance by developers to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping into/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints, on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same line of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements to support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB – The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD – Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th international conference on Software engineering – ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 – Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering – FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications – OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension – ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering – FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 – Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings – International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings – IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is event data that is collected when a developer performs some actions during a debugging session.
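To make the meta-model more concrete, the sketch below renders two of these concepts as plain Java classes. It is only an illustration: the paper does not show the actual SDS entity definitions, so the field names here are assumptions derived from the concept descriptions above.

// Hypothetical, simplified rendering of two SDI meta-model entities.
// Field names are assumptions based on the descriptions above, not
// the actual SDS schema.
class Session {}
class Type {}
class Method {}

class Breakpoint {
    Session session;  // debugging session in which it was toggled
    Type type;        // class or interface where it was set
    Method method;    // enclosing method, if any
    int line;         // location inside the type's source file
}

class Invocation {
    Session session;  // session during which the call was observed
    Method invoking;  // caller visited by the developer
    Method invoked;   // callee visited by the developer
}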

The SDS provides several services for manipulating, querying, and searching the collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
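Such a request can be issued from any HTTP client. As an illustration only, the sketch below queries the endpoint above using Java's built-in java.net.http client (Java 11+); the exact shape of the returned JSON is defined by the SDS and is not assumed here.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmApiExample {
    public static void main(String[] args) throws Exception {
        // GET the developers whose name is "petrillo", as in the
        // example request above.
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/"
                        + "search/findByName?name=petrillo"))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON list of developers
    }
}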

SQL Query Console The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine The SDS also provides an ElasticSearch31 engine, which is a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
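To give a flavour of the kind of pattern such a console accepts, the sketch below runs a Cypher query from Java through the official Neo4j driver. The connection settings, node label, and relationship type (Method, INVOKES) are illustrative assumptions, not the actual SDS schema, which is the one visible in Figure 16.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class SwarmGraphQuery {
    public static void main(String[] args) {
        // Connect to the graph database (credentials are placeholders).
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "password"));
             Session session = driver.session()) {
            // Hypothetical schema: methods as nodes, invocations as edges.
            Result result = session.run(
                "MATCH (a:Method)-[:INVOKES]->(b:Method) RETURN a.name, b.name");
            result.forEachRemaining(record ->
                System.out.println(record.get("a.name").asString()
                        + " -> " + record.get("b.name").asString()));
        }
    }
}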

Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.
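A minimal sketch of how such a tracer can hook into the Eclipse debug infrastructure is shown below. It registers the two listener interfaces named above through the standard org.eclipse.debug.core API; the bodies only print events, whereas the real SDT analyses stack-trace items and sends the extracted data to the SDS.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

// Skeleton tracer: listens to debug events and breakpoint changes.
public class MinimalDebugTracer
        implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);
        plugin.getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // The real SDT filters stepping events (e.g., STEP_INTO,
            // STEP_OVER, STEP_RETURN) and extracts stack-trace items here.
            System.out.println("Debug event: " + event);
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // The real SDT sends the breakpoint's type and location to the SDS.
        System.out.println("Breakpoint added: " + breakpoint);
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}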

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs They are direct call graphs [31], as shown in Figure 21, to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
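Internally, such a tolerant lookup can be expressed as an ElasticSearch match query with fuzziness enabled. The sketch below issues one over HTTP from Java; the index and field names are illustrative assumptions, not the SDS's actual mapping.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        // A fuzzy match query: the misspelled "fcatory" still matches
        // breakpoints toggled in types whose name contains "factory".
        String query = "{ \"query\": { \"match\": { \"typeName\": "
                + "{ \"query\": \"fcatory\", \"fuzziness\": \"AUTO\" } } } }";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // matching breakpoints as JSON
    }
}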


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes invoking methods and β is the subset of all vertexes invoked by methods, then the Starting and Ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
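Given the definition above, computing both sets from a list of collected invocations is a simple set difference. The sketch below does it in plain Java over pairs of method identifiers; the representation of an invocation is an assumption for illustration, since the SDS stores invocations in its own schema.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartEndMethods {
    // One invocation edge: 'invoking' calls 'invoked'.
    record Invocation(String invoking, String invoked) {}

    public static void main(String[] args) {
        List<Invocation> edges = List.of(
                new Invocation("main", "parse"),
                new Invocation("parse", "readEntry"));

        Set<String> alpha = new HashSet<>(); // vertexes that invoke (α)
        Set<String> beta = new HashSet<>();  // vertexes that are invoked (β)
        for (Invocation e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }

        Set<String> starting = new HashSet<>(alpha); // { v | v ∈ α, v ∉ β }
        starting.removeAll(beta);
        Set<String> ending = new HashSet<>(beta);    // { v | v ∈ β, v ∉ α }
        ending.removeAll(alpha);

        System.out.println("Starting: " + starting); // [main]
        System.out.println("Ending: " + ending);     // [readEntry]
    }
}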

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory for developer explorations. Moreover, dividing software exploration by sessions makes its call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using Eclipse Debug Core services. However, SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

Swarm Debugging the Collective Intelligence on Interactive Debugging 21

Table 8 Study 1 - Breakpoints in the same line of code (JabRef) in all tasks

Classes Lines of Code Breakpoints

BibtexParser 138151159 222

160165168 323

176198199299 2222

EntryEditor 717720721 342

723837842 232

11841393 32

BibDatabase 175187223456 2326

OpenDatabaseAction 433450451 424

JabRefDesktop 4084430 223

SaveDatabaseAction 177188 42

BasePanel 935969 25

AuthorsFormatter 43131 54

EntryTableTransferHandler 346 2

FieldTextMenu 84 2

JabRefFrame 1119 2

JabRefMain 8 5

URLUtil 95 2

Fig 6 Methods with 5 or more breakpoints

Finally we count how many breakpoints are in the same method acrosstasks and participants indicating that there were ldquopreferredrdquo methods forsetting breakpoints independently of task or participant We find that 37methods received at least two breakpoints and 13 methods received five ormore breakpoints during different tasks by different developers as reported inFigure 6 In particular the method EntityEditorstoreSource received 24 break-

22Please give a shorter version with authorrunning and titlerunning prior to maketitle

Table 9 Study 1 - Breakpoints by class across different tasks

Types Issue 318 Issue 667 Issue 669 Issue 993 Issue 1026 Breakpoints Dev Diversities

SaveDatabaseAction Yes Yes Yes 7 2

BasePanel Yes Yes Yes Yes 14 7

JabRefDesktop Yes Yes 9 4

EntryEditor Yes Yes Yes 36 4

BibtexParser Yes Yes Yes 44 6

OpenDatabaseAction Yes Yes Yes 19 13

JabRef Yes Yes Yes 3 3

JabRefMain Yes Yes Yes Yes 5 4

URLUtil Yes Yes 4 2

BibDatabase Yes Yes Yes 19 4

points and the method BibtexParserparseFileContent received 20 breakpointsby different developers on different tasks

Our results suggest that developers do not choosebreakpoints lightly and there is a rationale intheir setting breakpoints because different devel-opers set breakpoints on the same line of code forthe same task and different developers set break-points on the same type or method for differenttasks Furthermore our results show that differentdevelopers for different tasks set breakpoints atthe same locations These results show the useful-ness of collecting and sharing breakpoints to assistdevelopers during maintenance tasks

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers we con-ducted a controlled experiment and interviews focusing on analysing debuggingbehaviors from 30 professional developers We intended to evaluate if sharinginformation obtained in previous debugging sessions supports debugging tasksWe wish to answer the following two research questions

RQ5 Is Swarm Debuggingrsquos Global View useful in terms of supporting debuggingtasks

RQ6 Is Swarm Debuggingrsquos Global View useful in terms of sharing debuggingtasks

Swarm Debugging the Collective Intelligence on Interactive Debugging 23

61 Study design

The study consisted of two parts (1) a qualitative evaluation using GV ina browser and (2) a controlled experiment on fault location tasks in a Tetrisprogram using GV integrated into Eclipse The planning realization and someresults are presented in the following sections

611 Subject System

For this qualitative evaluation we chose JabRef20 as subject system JabRef isa reference management software developed in Java It is open-source and itsfaults are publicly reported Moreover JabRef is of reasonably good quality

612 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario we recruited 30 professionalfreelancer developers21 being 23 male and seven female Our participants haveon average six years of experience in software development (st dev four years)They have in average 48 years of Java experience (st dev 33 years) and 97used Eclipse As shown in Figure 7 67 are advanced or experts on Java

Among these professionals 23 participated in a qualitative evaluation (qual-itative evaluation of GV) and 11 participated in fault location (controlled ex-periment - 7 control and 6 experiment) using the Swarm Debugging GlobalView (GV) in Eclipse

20 httpwwwjabreforg21 httpswwwfreelancercom

24Please give a shorter version with authorrunning and titlerunning prior to maketitle

613 Task Description

We chose debugging tasks to trigger the participantsrsquo debugging sessions Weasked participants to find the locations of true faults in JabRef We picked 5faults reported against JabRef v32 in its issue-tracking system ie Issues 318993 1026 1173 1235 and 1251 We asked participants to find the locations ofthe faults asking questions as Where was the fault for Task 318 or For Task1173 where would you toggle a breakpoint to fix the fault and about positiveand negative aspects of GV 22

614 Artifacts and Working Environment

After the subjectrsquos profile survey we provided artifacts to support the twophases of our evaluation For phase one we provided an electronic form withinstructions to follow and questions to answer The GV was available athttpserverswarmdebuggingorg For phase two we provided partici-pants with two instruction documents The first document was an experimenttutorial23 that explained how to install and configure all tools to perform awarm-up task and the experimental study We also used the warm-up task toconfirm that the participantsrsquo environment was correctly configured and thatthe participants understood the instructions The warm-up task was describedusing a video to guide the participants We make this video available on-line24The second document was an electronic form to collect the results and otherassessments made using the integrated GV

For this experimental study we used Eclipse Mars 2 and Java 8 the SDIwith GV and its Swarm Debugging Tracer plug-in and two Java projects asmall Tetris game for the warm-up task and JabRef v32 for the experimentalstudy All participants received the same workspace provided by our artifactrepository

615 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issuesusing GV on a regular Web browser without accessing the JabRef source codeWe asked the participants to identify the ldquotyperdquo (classes) in which the faultswere located for Issues 318 667 and 669 using only the GV We requiredan explanation for each answer In addition to providing information aboutthe usefulness of the GV for task comprehension this evaluation helped theparticipants to become familiar with the GV

The controlled experiment was a fault-location task in which we askedthe same participants to find the location of faults using the GV integratedinto their Eclipse IDE We divided the participants into two groups a control

22 The full qualitative evaluation survey is available on httpsgooglforms

c6lOS80TgI3i4tyI223 httpswarmdebuggingorgpublicationsexperimenttutorialhtml24 httpsyoutubeU1sBMpfL2jc

Swarm Debugging the Collective Intelligence on Interactive Debugging 25

group (seven participants) and an experimental group (six participants) Par-ticipants from the control group performed fault location for Issues 993 and1026 without using the GV while those from the experimental group didthe same tasks using the GV

616 Data Collection

In the qualitative evaluation the participants answered the questions directlyin an electronic form They used the GV available on-line25 with collected datafor JabRef Issues 318 667 669

In the controlled experiment each participant executed the warm-up taskThis task consisted in starting a debugging session toggling a breakpointand debugging a Tetris program to locate a given method After the warm-up task each participant executed debugging sessions to find the locationof the faults described in the five issues We set a time constraint of onehour We asked participants to control their fatigue asking them to go tothe next task if they felt tired while informing us of this situation in theirreports Finally each participant filled a report to provide answers and otherinformation like whether they completed the tasks successfully or not and(just for the experimental group) commenting on the usefulness of GV duringeach task

All services were available on our server26 during the debugging sessionsand the experimental data were collected within three days We also cap-tured video from the participants obtaining more than 3 hours of debuggingThe experiment tutorial contained the instruction to install and set the OpenBroadcaster Software 27 for video recording tool

62 Results

We now discuss the results of our evaluation

RQ5 Is Swarm Debuggingrsquos Global View useful in terms of supporting debug-ging tasks

During the qualitative evaluation we asked the participants to analyse thegraph generated by GV to identify the type of the location of each faultwithout reading the task description or looking at the code The GVgenerated graph had invocations collected from previous debugging sessionsThese invocations were generated during ldquogoodrdquo sessions (the fault was found ndash2731 sessions) and ldquobadrdquo sessions (the fault was not found ndash 431 sessions)We analysed results obtained for Tasks 318 667 and 699 comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com



For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig. 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig. 9 GV for Task 0667

Swarm Debugging the Collective Intelligence on Interactive Debugging 27

Fig. 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations, without seeing the code beforehand.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants of the experimental group used GV to support their tasks (video recording analysis): navigating, reorganizing, and especially diving into the types by double-clicking on a selected type. We asked participants if GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) through our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig. 11 GV usefulness - experimental phase one

Fig. 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   % [E/C]
First breakpoint   00:02:55      00:03:40         -44           126%
Time to start      00:04:44      00:05:18         -33           112%
Elapsed time       00:30:08      00:16:05         843           53%

Task 1026
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   % [E/C]
First breakpoint   00:02:42      00:04:48         -126          177%
Time to start      00:04:02      00:03:43         19            92%
Elapsed time       00:24:58      00:20:41         257           83%

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we found that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to carefully toggle the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We come back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known to be not scalable, so we are expecting issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show in different parts of the graph the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view, whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily it didn't require too much effort to spot the problem, because all the related code was concentrated inside the parser class. Luckily I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point out further ways of improving our approach.



7 Discussion

We now discuss some implications of our work for Software Engineering researchers, developers, debuggers' developers, and educators. SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify the debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by one developer could help to assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.
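As a toy illustration of such a recommendation, ranking locations by how often past breakpoints were recorded on them could start as simply as the sketch below. It is only a sketch: the class names and data are illustrative, not SDI code (the type names echo Study 1, the line numbers are invented).

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BreakpointRecommender {

    record SharedBreakpoint(String typeName, int line) { }

    // Rank locations by the number of past breakpoints recorded on them.
    static List<SharedBreakpoint> rank(List<SharedBreakpoint> history) {
        Map<SharedBreakpoint, Integer> counts = new HashMap<>();
        for (SharedBreakpoint b : history) {
            counts.merge(b, 1, Integer::sum);
        }
        List<SharedBreakpoint> ranked = new ArrayList<>(counts.keySet());
        ranked.sort((a, b) -> counts.get(b) - counts.get(a));
        return ranked;
    }

    public static void main(String[] args) {
        List<SharedBreakpoint> history = List.of(
                new SharedBreakpoint("BibtexParser", 160),
                new SharedBreakpoint("BibtexParser", 160),
                new SharedBreakpoint("EntryEditor", 88));
        // Most frequently toggled location first.
        rank(history).forEach(System.out::println);
    }
}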

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc., through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant for the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].



Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that should be done during software maintenance by developers to make sure that changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.



Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without looking at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.

Our results provide evidence that previous debugging sessions provide insights to developers and can be starting points for them when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.



Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson. Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs. Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright. GNU DDD - Data Display Debugger (2010)
6. A. Ko. Proceeding of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers. 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman. How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen. Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm. Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng. Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar. Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz. Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel. Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick. Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn. Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming. IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter. IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers. Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker. Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices. Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder. IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung. IEEE Transaction on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant. International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson. Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso. Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken. Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa. Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer. Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan. ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig. 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We choose and define domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.

– Product is the target software product. A product is a set of Eclipse projects (1 or more).

– Task is the task to be executed by developers.

– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig. 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.

– Event is the event data collected when a developer performs some actions during a debugging session.
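For illustration, the entities above can be rendered as plain Java types along the following lines. This is only a sketch: the field choices are assumptions derived from the descriptions in this section, not the actual SDS schema.

import java.util.List;

// Illustrative rendering of the Swarm Debugging metamodel (fields are assumptions).
record Developer(long id, String name) { }
record Product(long id, String name, List<String> eclipseProjects) { }
record Task(long id, String title) { }
record Session(long id, Developer developer, Product product, Task task) { }
record Type(long id, String fullName, String sourceFile) { }
record Method(long id, String signature, Type declaringType) { }
record Namespace(long id, String name) { }
record Invocation(long id, Method invoking, Method invoked, Session session) { }
record Breakpoint(long id, Type type, Method method, int line, Session session) { }
record Event(long id, String kind, Session session) { }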

The SDS provides several services for manipulating, querying, and searching the collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

29 http://projects.spring.io/spring-boot



http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
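For illustration, the request above can be issued from any HTTP client; a minimal Java sketch using java.net.http follows. The endpoint comes from the example above, while the JSON payload shape shown in the comment is an assumption (typical Spring Data REST style), not documented SDS output.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmRestClientExample {
    public static void main(String[] args) throws Exception {
        // Ask the SDS for all developers named "petrillo".
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // A typical Spring Data REST answer would look like (illustrative only):
        // {"_embedded": {"developers": [{"name": "petrillo"}]}}
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}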

SQL Query Console. The SDS provides a console30 that receives SQL queries on the debugging data, providing relational aggregations and functions.
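As an illustration of the kind of relational aggregation the console supports, the sketch below counts how many breakpoints were toggled on each type. It is a sketch only: the connection settings and the table/column names are assumptions, since the paper does not publish the SQL schema.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BreakpointsPerType {
    public static void main(String[] args) throws Exception {
        // Connection parameters are illustrative; requires the PostgreSQL JDBC driver.
        try (Connection con = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/swarm", "swarm", "secret");
             Statement st = con.createStatement();
             // Aggregate breakpoints by the type on which they were toggled.
             ResultSet rs = st.executeQuery(
                 "SELECT t.name, COUNT(*) AS breakpoints "
                 + "FROM breakpoint b JOIN type t ON b.type_id = t.id "
                 + "GROUP BY t.name ORDER BY breakpoints DESC")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + ": " + rs.getInt(2));
            }
        }
    }
}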

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig. 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig. 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
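For illustration, a query in this spirit, listing every method transitively reached from a given starting method across the collected sessions, could be run through the Neo4j Java driver (1.x API). The node labels, relationship type, method name, and credentials below are assumptions based on the metamodel above, not the actual SDS graph schema.

import org.neo4j.driver.v1.AuthTokens;
import org.neo4j.driver.v1.Driver;
import org.neo4j.driver.v1.GraphDatabase;
import org.neo4j.driver.v1.Session;
import org.neo4j.driver.v1.StatementResult;

public class InvocationPathsQuery {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            // Methods transitively invoked from a chosen starting method.
            StatementResult result = session.run(
                "MATCH (start:Method {name: 'parseFileContent'})-[:INVOKES*]->(m:Method) "
                + "RETURN DISTINCT m.name AS name");
            while (result.hasNext()) {
                System.out.println(result.next().get("name").asString());
            }
        }
    }
}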

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
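Both listener interfaces belong to org.eclipse.debug.core; a skeleton of a tracer in this spirit is sketched below. It is a simplification, not the actual SDT code: the real tracer also analyzes stack traces and posts the extracted data to the SDS.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracerSkeleton implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        // Subscribe to stepping/suspend events and to breakpoint changes.
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.STEP_END) {
                // A Step Into/Over/Return just ended: inspect the stack here
                // and record the invoking/invoked method pair.
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // Send the breakpoint's type and location to the Swarm Debugging Services.
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}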

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and the stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig. 17 The Swarm Tracer architecture [17]

Fig. 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.


Fig. 19 Breakpoint search tool (fuzzy search example)


To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, used to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


Fig. 20 Sequence stack diagram for the Bridge design pattern

Researchers can navigate the sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
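For illustration, such a fuzzy lookup could be expressed as an ElasticSearch query issued over HTTP, as sketched below. The index name and field name are assumptions, not the actual SDS mapping.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        // A fuzzy query tolerates the misspelling "fcatory" and still matches "factory".
        String query = "{ \"query\": { \"fuzzy\": { \"typeName\": { \"value\": \"fcatory\" } } } }";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}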


Fig. 21 Method call graph for the Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertices V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair ⟨Vi, Vj⟩, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertices invoking methods and β is the subset of all vertices invoked by methods, then the starting and ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
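In code, both sets fall out directly from the collected invocation edges. A minimal sketch follows (the type and method names are illustrative, not SDI code):

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GraphEndpoints {

    record Invocation(String invoking, String invoked) { }

    // α = all invoking methods, β = all invoked methods.
    static Set<String> startingPoints(List<Invocation> edges) {
        Set<String> alpha = new HashSet<>();
        Set<String> beta = new HashSet<>();
        for (Invocation e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }
        alpha.removeAll(beta);   // keep methods that are never invoked
        return alpha;
    }

    static Set<String> endingPoints(List<Invocation> edges) {
        Set<String> alpha = new HashSet<>();
        Set<String> beta = new HashSet<>();
        for (Invocation e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }
        beta.removeAll(alpha);   // keep methods that never invoke others
        return beta;
    }

    public static void main(String[] args) {
        List<Invocation> edges = List.of(
                new Invocation("main", "parse"),
                new Invocation("parse", "readField"));
        System.out.println(startingPoints(edges)); // [main]
        System.out.println(endingPoints(edges));   // [readField]
    }
}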

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developer explorations. Moreover, dividing software exploration by sessions makes its call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.



Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


22Please give a shorter version with authorrunning and titlerunning prior to maketitle

Table 9 Study 1 - Breakpoints by class across different tasks

Types Issue 318 Issue 667 Issue 669 Issue 993 Issue 1026 Breakpoints Dev Diversities

SaveDatabaseAction Yes Yes Yes 7 2

BasePanel Yes Yes Yes Yes 14 7

JabRefDesktop Yes Yes 9 4

EntryEditor Yes Yes Yes 36 4

BibtexParser Yes Yes Yes 44 6

OpenDatabaseAction Yes Yes Yes 19 13

JabRef Yes Yes Yes 3 3

JabRefMain Yes Yes Yes Yes 5 4

URLUtil Yes Yes 4 2

BibDatabase Yes Yes Yes 19 4

points and the method BibtexParserparseFileContent received 20 breakpointsby different developers on different tasks

Our results suggest that developers do not choosebreakpoints lightly and there is a rationale intheir setting breakpoints because different devel-opers set breakpoints on the same line of code forthe same task and different developers set break-points on the same type or method for differenttasks Furthermore our results show that differentdevelopers for different tasks set breakpoints atthe same locations These results show the useful-ness of collecting and sharing breakpoints to assistdevelopers during maintenance tasks

6 Evaluation of Swarm Debugging using GV

To assess other benefits that our approach can bring to developers we con-ducted a controlled experiment and interviews focusing on analysing debuggingbehaviors from 30 professional developers We intended to evaluate if sharinginformation obtained in previous debugging sessions supports debugging tasksWe wish to answer the following two research questions

RQ5 Is Swarm Debuggingrsquos Global View useful in terms of supporting debuggingtasks

RQ6 Is Swarm Debuggingrsquos Global View useful in terms of sharing debuggingtasks

Swarm Debugging the Collective Intelligence on Interactive Debugging 23

61 Study design

The study consisted of two parts (1) a qualitative evaluation using GV ina browser and (2) a controlled experiment on fault location tasks in a Tetrisprogram using GV integrated into Eclipse The planning realization and someresults are presented in the following sections

611 Subject System

For this qualitative evaluation we chose JabRef20 as subject system JabRef isa reference management software developed in Java It is open-source and itsfaults are publicly reported Moreover JabRef is of reasonably good quality

612 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario we recruited 30 professionalfreelancer developers21 being 23 male and seven female Our participants haveon average six years of experience in software development (st dev four years)They have in average 48 years of Java experience (st dev 33 years) and 97used Eclipse As shown in Figure 7 67 are advanced or experts on Java

Among these professionals 23 participated in a qualitative evaluation (qual-itative evaluation of GV) and 11 participated in fault location (controlled ex-periment - 7 control and 6 experiment) using the Swarm Debugging GlobalView (GV) in Eclipse

20 httpwwwjabreforg21 httpswwwfreelancercom

24Please give a shorter version with authorrunning and titlerunning prior to maketitle

613 Task Description

We chose debugging tasks to trigger the participantsrsquo debugging sessions Weasked participants to find the locations of true faults in JabRef We picked 5faults reported against JabRef v32 in its issue-tracking system ie Issues 318993 1026 1173 1235 and 1251 We asked participants to find the locations ofthe faults asking questions as Where was the fault for Task 318 or For Task1173 where would you toggle a breakpoint to fix the fault and about positiveand negative aspects of GV 22

614 Artifacts and Working Environment

After the subjectrsquos profile survey we provided artifacts to support the twophases of our evaluation For phase one we provided an electronic form withinstructions to follow and questions to answer The GV was available athttpserverswarmdebuggingorg For phase two we provided partici-pants with two instruction documents The first document was an experimenttutorial23 that explained how to install and configure all tools to perform awarm-up task and the experimental study We also used the warm-up task toconfirm that the participantsrsquo environment was correctly configured and thatthe participants understood the instructions The warm-up task was describedusing a video to guide the participants We make this video available on-line24The second document was an electronic form to collect the results and otherassessments made using the integrated GV

For this experimental study we used Eclipse Mars 2 and Java 8 the SDIwith GV and its Swarm Debugging Tracer plug-in and two Java projects asmall Tetris game for the warm-up task and JabRef v32 for the experimentalstudy All participants received the same workspace provided by our artifactrepository

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV on a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (class) in which the fault was located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

22 The full qualitative evaluation survey is available at https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experiment/tutorial.html
24 https://youtu.be/U1sBMpfL2jc

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line²⁵ with data collected for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not, and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server²⁶ during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software²⁷ as the video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type where each fault was located, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available at https://obsproject.com

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig 9 GV for Task 0667


Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without previously seeing the code.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of the participants in the experimental group used GV to support their tasks (video-recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants whether GV is useful to support software-maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig 11 GV usefulness - experimental phase one

Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching code.


Table 10 Results from control and experimental groups (average)

Task 0993
  Metric             Control [C]   Experiment [E]   Δ [C-E] (s)   % [E/C]
  First breakpoint   00:02:55      00:03:40         -44           126%
  Time to start      00:04:44      00:05:18         -33           112%
  Elapsed time       00:30:08      00:16:05         843           53%

Task 1026
  Metric             Control [C]   Experiment [E]   Δ [C-E] (s)   % [E/C]
  First breakpoint   00:02:42      00:04:48         -126          177%
  Time to start      00:04:02      00:03:43         19            92%
  Elapsed time       00:24:58      00:20:41         257           83%

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we observed that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time in carefully setting the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time in deciding on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation, and we now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We return to some of these limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example, a null-pointer exception, and the location of the source of the fault, for example, a constructor where the field is not initialised.

However, we built our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point out further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, developers of debuggers, and educators. SDI (and GV) are open and freely available on-line²⁸, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify the debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, breakpoints previously toggled by one developer could help another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools, using novel data-mining techniques, to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, their levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant between the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware that all faults had already been fixed in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the chosen system can affect our results.

9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], and version history and defect-report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruptions [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach serves both editing mode (finding breakpoints or visualizing paths) and interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault-localization techniques, showing that the generated breakpoints can usually save some human effort in debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints, on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same line of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations, without previously looking at the code, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agency CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson. Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs. Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright. GNU DDD - Data Display Debugger (2010)
6. A. Ko, in Proceedings of the 28th International Conference on Software Engineering - ICSE '06 (2006), p. 989. DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers. 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman. How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen. Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm. Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng. IEEE Software 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar. Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-Oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz. Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel. Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick. Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn. Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming. IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter. IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers. Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker. Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices. Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder. IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung. IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant. International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson. Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE) (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, in Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11 (2011), p. 199. DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken. Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa. Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer. Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan. ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file; SDS only considers types whose source code is available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
– Event is the event data collected when a developer performs some action during a debugging session.

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built with the Spring Boot framework²⁹. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

  http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with the list of developers whose name is "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
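For illustration, a plausible shape of such a response is sketched below, following Spring Data REST conventions; the exact field names depend on the SDI's entity definitions and are assumptions here:

  {
    "_embedded": {
      "developers": [
        { "id": 1, "name": "petrillo" }
      ]
    }
  }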

SQL Query Console. The SDS provides a console³⁰ for executing SQL queries on the debugging data, providing relational aggregations and functions.
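For instance, a query in the following spirit could rank the types on which developers toggle breakpoints most often; the table and column names are assumptions based on the meta-model above, not the actual SDS schema:

  -- Types with the most breakpoints across all collected sessions
  SELECT type_name, COUNT(*) AS breakpoint_count
  FROM breakpoint
  GROUP BY type_name
  ORDER BY breakpoint_count DESC;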

Full-text Search Engine. The SDS also provides ElasticSearch³¹, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J³² graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

32 http://neo4j.com

Figure 16 shows an example of a Cypher query and the resulting graph.
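A query in the spirit of Figure 16 is sketched below; the node labels and relationship types are illustrative assumptions, not the actual SDI graph schema:

  // Methods reached during one session, ranked by how often they were invoked
  MATCH (s:Session {id: 42})-[:CONTAINS]->(i:Invocation)-[:INVOKED]->(m:Method)
  RETURN m.name AS method, count(i) AS calls
  ORDER BY calls DESC;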

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and the stack-trace items are analyzed by the Tracer, which extracts method invocations.
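The skeleton below illustrates how such listeners hook into the Eclipse debug framework. The interfaces and registration calls are the standard Eclipse Debug Core API; the class body, however, is a minimal sketch, not the actual SDT implementation:

  import org.eclipse.core.resources.IMarkerDelta;
  import org.eclipse.debug.core.DebugEvent;
  import org.eclipse.debug.core.DebugPlugin;
  import org.eclipse.debug.core.IBreakpointListener;
  import org.eclipse.debug.core.IDebugEventSetListener;
  import org.eclipse.debug.core.model.IBreakpoint;

  public class MinimalDebugTracer
          implements IDebugEventSetListener, IBreakpointListener {

      // Register this tracer with the Eclipse debug framework.
      public void register() {
          DebugPlugin plugin = DebugPlugin.getDefault();
          plugin.addDebugEventListener(this);
          plugin.getBreakpointManager().addBreakpointListener(this);
      }

      @Override
      public void handleDebugEvents(DebugEvent[] events) {
          for (DebugEvent event : events) {
              // A suspend caused by a step or a breakpoint is the point where
              // the SDT would inspect the suspended thread's stack frames and
              // send the extracted invocations to the SDS via REST.
              if (event.getKind() == DebugEvent.SUSPEND
                      && (event.getDetail() == DebugEvent.STEP_END
                          || event.getDetail() == DebugEvent.BREAKPOINT)) {
                  // analyse stack-trace items here
              }
          }
      }

      @Override
      public void breakpointAdded(IBreakpoint breakpoint) {
          // capture the toggled breakpoint (type, method, line) and persist it
      }

      @Override
      public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

      @Override
      public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
  }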

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.



Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
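For example, a fuzzy ElasticSearch query such as the following would still match breakpoints toggled on factory classes despite the typo; the field name is an assumption for illustration:

  {
    "query": {
      "fuzzy": {
        "typeName": { "value": "fcatory", "fuzziness": "AUTO" }
      }
    }
  }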


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is a pair ⟨Vi, Vj⟩ where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes invoked by methods, then the Starting and Ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
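This definition translates directly into set operations over the collected invocation pairs, as in the following small Java sketch (class and method names are illustrative):

  import java.util.HashSet;
  import java.util.List;
  import java.util.Set;

  public class EntryExitFinder {

      // Each invocation is a pair: pair[0] invokes pair[1].
      public static Set<String> startingPoints(List<String[]> invocations) {
          Set<String> invoking = new HashSet<>();
          Set<String> invoked = new HashSet<>();
          for (String[] pair : invocations) {
              invoking.add(pair[0]);
              invoked.add(pair[1]);
          }
          Set<String> starting = new HashSet<>(invoking);
          starting.removeAll(invoked); // keep vertexes in α but not in β
          return starting;
      }

      public static Set<String> endingPoints(List<String[]> invocations) {
          Set<String> invoking = new HashSet<>();
          Set<String> invoked = new HashSet<>();
          for (String[] pair : invocations) {
              invoking.add(pair[0]);
              invoked.add(pair[1]);
          }
          Set<String> ending = new HashSet<>(invoked);
          ending.removeAll(invoking); // keep vertexes in β but not in α
          return ending;
      }
  }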

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration into sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 23: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

Swarm Debugging the Collective Intelligence on Interactive Debugging 23

61 Study design

The study consisted of two parts (1) a qualitative evaluation using GV ina browser and (2) a controlled experiment on fault location tasks in a Tetrisprogram using GV integrated into Eclipse The planning realization and someresults are presented in the following sections

611 Subject System

For this qualitative evaluation we chose JabRef20 as subject system JabRef isa reference management software developed in Java It is open-source and itsfaults are publicly reported Moreover JabRef is of reasonably good quality

612 Participants

Fig 7 Java expertise

To reproduce a realistic industry scenario we recruited 30 professionalfreelancer developers21 being 23 male and seven female Our participants haveon average six years of experience in software development (st dev four years)They have in average 48 years of Java experience (st dev 33 years) and 97used Eclipse As shown in Figure 7 67 are advanced or experts on Java

Among these professionals 23 participated in a qualitative evaluation (qual-itative evaluation of GV) and 11 participated in fault location (controlled ex-periment - 7 control and 6 experiment) using the Swarm Debugging GlobalView (GV) in Eclipse

20 httpwwwjabreforg21 httpswwwfreelancercom

24Please give a shorter version with authorrunning and titlerunning prior to maketitle

613 Task Description

We chose debugging tasks to trigger the participantsrsquo debugging sessions Weasked participants to find the locations of true faults in JabRef We picked 5faults reported against JabRef v32 in its issue-tracking system ie Issues 318993 1026 1173 1235 and 1251 We asked participants to find the locations ofthe faults asking questions as Where was the fault for Task 318 or For Task1173 where would you toggle a breakpoint to fix the fault and about positiveand negative aspects of GV 22

6.1.4 Artifacts and Working Environment

After the subject's profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial²³ that explained how to install and configure all tools to perform a warm-up task and the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants. We make this video available on-line²⁴. The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV on a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants to become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

²² The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
²³ http://swarmdebugging.org/publications/experiment/tutorial.html
²⁴ https://youtu.be/U1sBMpfL2jc

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line²⁵ with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted in starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled out a report to provide answers and other information, like whether they completed the tasks successfully or not, and (just for the experimental group) commenting on the usefulness of GV during each task.

All services were available on our server²⁶ during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software²⁷ video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.

²⁵ http://server.swarmdebugging.org
²⁶ http://server.swarmdebugging.org
²⁷ OBS is available on https://obsproject.com

For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig 8 GV for Task 0318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig 9 GV for Task 0667


Fig 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths in a graph visualisation from several debugging sessions helps developers produce correct hypotheses about fault locations without seeing the code previously.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants of the experimental group used GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into the type, double-clicking on a selected type. We asked participants if GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) through our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) on the task survey after fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig 11 GV usefulness - experimental phase one

Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


Table 10 Results from control and experimental groups (average)

Task 0993
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026
Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recording sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we observed that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to toggle the first breakpoint carefully but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time to decide on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We go back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where a field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known to be not scalable, so we are expecting issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show the methods and their classes independently in different parts of the graph, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point to further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, developers of debuggers, and educators. SDI (and GV) is open and freely available on-line²⁸, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns and to identify the debugging strategies that are most efficient in the context of their projects, to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (a line of code, a class, or a method) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.

²⁸ http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers in deciding where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc., through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant on the first and last tasks. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise work related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that should be done during software maintenance by developers to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same line of code for the same task, and also different developers toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths in a graph visualisation from several debugging sessions produced elements to support developers' hypotheses about fault locations, without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ESEC/FSE '09) (ACM Press, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is a SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the data collected when a developer performs some action during a debugging session.
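
To make the meta-model concrete, the sketch below renders the Invocation concept as a plain Java class. The field names are illustrative assumptions derived from Figure 14, not the actual SDS source code:

    // Illustrative sketch of the Invocation concept; field names are
    // assumptions mirroring Figure 14, not the actual SDS entities.
    public class Invocation {
        private final String invokingMethod; // fully qualified caller name
        private final String invokedMethod;  // fully qualified callee name
        private final long sessionId;        // Swarm session that recorded it

        public Invocation(String invokingMethod, String invokedMethod, long sessionId) {
            this.invokingMethod = invokingMethod;
            this.invokedMethod = invokedMethod;
            this.sessionId = sessionId;
        }

        public String getInvokingMethod() { return invokingMethod; }
        public String getInvokedMethod()  { return invokedMethod; }
        public long getSessionId()        { return sessionId; }
    }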

The SDS provides several services for manipulating, querying, and searching collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework²⁹. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

²⁹ http://projects.spring.io/spring-boot
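
For illustration, a client could issue the request above with Java's standard HTTP client (java.net.http, available since Java 11). This is a minimal sketch assuming the server and endpoint shown above are reachable; it is not part of the SDS code base:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class SwarmApiExample {
        public static void main(String[] args) throws Exception {
            // Query the SDS RESTful API for developers named "petrillo".
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(
                        "http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                    .GET()
                    .build();
            // The SDS answers with a JSON list of matching developers.
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        }
    }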

SQL Query Console. The SDS provides a console³⁰ to receive SQL queries on the debugging data, providing relational aggregations and functions.
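
As an illustration of the kind of relational aggregation the console supports, a similar query could be issued programmatically over JDBC. The connection URL, credentials, table, and column names below are assumptions based on the meta-model above, not the actual SDS schema:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class BreakpointStats {
        public static void main(String[] args) throws Exception {
            // Connection string and credentials are illustrative assumptions.
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://db.swarmdebugging.org/swarm", "user", "password");
                 Statement stmt = conn.createStatement();
                 // Count breakpoints per type to spot debugging "hot-spots".
                 ResultSet rs = stmt.executeQuery(
                     "SELECT type_name, COUNT(*) AS breakpoints "
                   + "FROM breakpoint GROUP BY type_name ORDER BY breakpoints DESC")) {
                while (rs.next()) {
                    System.out.println(rs.getString("type_name")
                        + ": " + rs.getLong("breakpoints"));
                }
            }
        }
    }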

Full-text Search Engine. The SDS also provides an ElasticSearch³¹ instance, which is a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

³⁰ http://db.swarmdebugging.org
³¹ https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J³² graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack-trace items are analyzed by the Tracer, extracting method invocations.
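
The sketch below illustrates how such events can be intercepted with the Eclipse Debug Core API (org.eclipse.debug.core). It is a simplified illustration of the idea, not the actual SDT source: the real tracer also implements IBreakpointListener and ships the collected data to the Swarm Debugging Services.

    import org.eclipse.debug.core.DebugEvent;
    import org.eclipse.debug.core.DebugPlugin;
    import org.eclipse.debug.core.IDebugEventSetListener;

    public class SwarmEventListener implements IDebugEventSetListener {

        public void start() {
            // Register this listener with the Eclipse debug infrastructure.
            DebugPlugin.getDefault().addDebugEventListener(this);
        }

        @Override
        public void handleDebugEvents(DebugEvent[] events) {
            for (DebugEvent event : events) {
                // Stepping events (Step Into/Over/Return) surface as SUSPEND
                // events; a tracer can then inspect the suspended stack trace.
                if (event.getKind() == DebugEvent.SUSPEND) {
                    System.out.println("Suspended, detail code: " + event.getDetail());
                }
            }
        }
    }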

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

³² http://neo4j.com


Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated in the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace, without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of breakpoint search in which the search box contains the misspelled word fcatory.
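
Conceptually, a fuzzy match tolerates a small edit distance between the query and the indexed identifiers, which is why a misspelling such as fcatory can still reach a type containing Factory. The sketch below illustrates the idea with a plain Levenshtein distance; the actual tool delegates fuzzy matching to ElasticSearch, and the type names here are hypothetical:

    import java.util.List;

    public class FuzzyBreakpointSearch {

        // Classic dynamic-programming Levenshtein edit distance.
        static int distance(String a, String b) {
            int[][] d = new int[a.length() + 1][b.length() + 1];
            for (int i = 0; i <= a.length(); i++) d[i][0] = i;
            for (int j = 0; j <= b.length(); j++) d[0][j] = j;
            for (int i = 1; i <= a.length(); i++)
                for (int j = 1; j <= b.length(); j++) {
                    int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                    d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                       d[i - 1][j - 1] + cost);
                }
            return d[a.length()][b.length()];
        }

        public static void main(String[] args) {
            // Hypothetical type names on which breakpoints were toggled.
            List<String> indexedTypes = List.of("EntryFactory", "BibtexParser", "URLUtil");
            String query = "fcatory"; // misspelled, as in Figure 19

            for (String type : indexedTypes)
                // Split camel-case names into tokens and accept any token
                // within edit distance 2 of the query.
                for (String token : type.split("(?<=[a-z])(?=[A-Z])"))
                    if (distance(token.toLowerCase(), query.toLowerCase()) <= 2)
                        System.out.println("candidate: " + type); // EntryFactory
        }
    }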


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked, then the Starting and Ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
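
In code, both sets fall out of a single pass over the collected invocation pairs. A minimal sketch, with methods simplified to strings and hypothetical names:

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class EntryExitFinder {
        public static void main(String[] args) {
            // Each pair <invoking, invoked> is one collected invocation edge.
            List<String[]> invocations = List.of(
                    new String[] {"main", "parse"},
                    new String[] {"parse", "readEntry"},
                    new String[] {"readEntry", "trim"});

            Set<String> invoking = new HashSet<>(); // alpha: vertexes that invoke
            Set<String> invoked = new HashSet<>();  // beta: vertexes that are invoked
            for (String[] edge : invocations) {
                invoking.add(edge[0]);
                invoked.add(edge[1]);
            }

            // StartingPoints = alpha \ beta; EndingPoints = beta \ alpha.
            Set<String> starting = new HashSet<>(invoking);
            starting.removeAll(invoked);
            Set<String> ending = new HashSet<>(invoked);
            ending.removeAll(invoking);

            System.out.println("Starting points: " + starting); // [main]
            System.out.println("Ending points: " + ending);     // [trim]
        }
    }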

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developer explorations. Moreover, software exploration is divided by sessions, and the resulting call graphs are easy to understand because only intentionally visited areas are shown on them: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 24: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

24Please give a shorter version with authorrunning and titlerunning prior to maketitle

613 Task Description

We chose debugging tasks to trigger the participantsrsquo debugging sessions Weasked participants to find the locations of true faults in JabRef We picked 5faults reported against JabRef v32 in its issue-tracking system ie Issues 318993 1026 1173 1235 and 1251 We asked participants to find the locations ofthe faults asking questions as Where was the fault for Task 318 or For Task1173 where would you toggle a breakpoint to fix the fault and about positiveand negative aspects of GV 22

6.1.4 Artifacts and Working Environment

After the subject's profile survey, we provided artifacts to support the two phases of our evaluation. For phase one, we provided an electronic form with instructions to follow and questions to answer. The GV was available at http://server.swarmdebugging.org. For phase two, we provided participants with two instruction documents. The first document was an experiment tutorial23 that explained how to install and configure all tools, to perform a warm-up task, and to perform the experimental study. We also used the warm-up task to confirm that the participants' environment was correctly configured and that the participants understood the instructions. The warm-up task was described using a video to guide the participants; we make this video available on-line.24 The second document was an electronic form to collect the results and other assessments made using the integrated GV.

For this experimental study, we used Eclipse Mars 2 and Java 8, the SDI with GV and its Swarm Debugging Tracer plug-in, and two Java projects: a small Tetris game for the warm-up task and JabRef v3.2 for the experimental study. All participants received the same workspace, provided by our artifact repository.

6.1.5 Study Procedure

The qualitative evaluation consisted of a set of questions about JabRef issues, using GV in a regular Web browser, without accessing the JabRef source code. We asked the participants to identify the "type" (classes) in which the faults were located for Issues 318, 667, and 669, using only the GV. We required an explanation for each answer. In addition to providing information about the usefulness of the GV for task comprehension, this evaluation helped the participants become familiar with the GV.

The controlled experiment was a fault-location task in which we asked the same participants to find the location of faults using the GV integrated into their Eclipse IDE. We divided the participants into two groups: a control

22 The full qualitative evaluation survey is available on https://goo.gl/forms/c6lOS80TgI3i4tyI2
23 http://swarmdebugging.org/publications/experimenttutorial.html
24 https://youtu.be/U1sBMpfL2jc


group (seven participants) and an experimental group (six participants). Participants from the control group performed fault location for Issues 993 and 1026 without using the GV, while those from the experimental group did the same tasks using the GV.

6.1.6 Data Collection

In the qualitative evaluation, the participants answered the questions directly in an electronic form. They used the GV available on-line25 with collected data for JabRef Issues 318, 667, and 669.

In the controlled experiment, each participant executed the warm-up task. This task consisted of starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, such as whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server26 during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software27 as the video-recording tool.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the

25 http://server.swarmdebugging.org
26 http://server.swarmdebugging.org
27 OBS is available on https://obsproject.com


number of participants who could propose a solution and the correctness of the solutions.

For Task 318 (Figure 8), 95% of the participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig. 8 GV for Task 0318

For Task 667 (Figure 9), 95% of the participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig. 9 GV for Task 0667


Fig. 10 GV for Task 0669

Finally, for Task 669 (Figure 10), again 95% of the participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths from several debugging sessions in a graph visualisation helps developers produce correct hypotheses about fault locations without seeing the code previously.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during the fault-location tasks. Our controlled experiment showed that 100% of the participants of the experimental group used GV to support their tasks (video recording analysis): navigating, reorganizing, and especially diving into a type by double-clicking on it. We asked participants whether GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) in our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) in the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig. 11 GV usefulness - experimental phase one

Fig. 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


Table 10 Results from control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:55      00:03:40         -44           126
Time to start      00:04:44      00:05:18         -33           112
Elapsed time       00:30:08      00:16:05         843           53

Task 1026

Metric             Control [C]   Experiment [E]   ∆ [C-E] (s)   [E/C] (%)
First breakpoint   00:02:42      00:04:48         -126          177
Time to start      00:04:02      00:03:43         19            92
Elapsed time       00:24:58      00:20:41         257           83

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time to set the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recorded sessions of Tasks 0993 and 1026, compiling the average results of the two groups in Table 10.

Observing the results in Table 10, we see that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to carefully toggle the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.
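As a worked check of Table 10's arithmetic (our own computation, not part of the original analysis), take the elapsed times of Task 0993 converted to seconds, C = 30:08 = 1808 s and E = 16:05 = 965 s:

∆ [C-E] = 1808 s - 965 s = 843 s, and [E/C] = 965/1808 ≈ 53%,

matching the last row of the table.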

Our results show that participants who used the shared debugging data invested more time to decide on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and to our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation, and we now discuss both its intrinsic and accidental advantages and limitations, as reported by the participants. We come back to some of the limitations in the next section, which describes the threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example, a null-pointer exception, and the location of the source of the fault, for example, a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known to scale poorly, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem, because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's


debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point to further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debugger developers, and educators. SDI (and GV) is open and freely available on-line,28 and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns and to identify the debugging strategies that are more efficient in the context of their projects, to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by one developer could help assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (a line of code, a class, or a method) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants, and with more systems and tasks, are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study, by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants were probably aware of the fact that all faults had already been fixed on GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise works related to debugging to allow a better positioning of our study among the published research.

Program Understanding Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful in editing mode (finding breakpoints or visualising paths) as well as during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated debugging tools Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53] by Hofer et al. builds on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging Jiang et al. [33] studied the change impact analysis process that should be performed by developers during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos, in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints, on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first


breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale: different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code with a higher incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements to support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging


behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agency CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson: Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs: Debugging with GDB – The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright: GNU DDD – Data Display Debugger (2010)
6. A. Ko: Proceeding of the 28th international conference on Software engineering – ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 – Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers: 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke: ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering – FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman: How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen: Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm: Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng: Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar: Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse: Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz: Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel: Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://link.springer.com/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick: Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn: Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming: IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter: IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers: Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications – OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker: Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices: Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder: IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke: ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung: IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension – ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering – FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant: International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson: Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ESEC/FSE '09) (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso: Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken: Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 – Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa: Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings – International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer: Proceedings – IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan: ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig. 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig. 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the event data collected when a developer performs some actions during a debugging session.
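To make the meta-model concrete, the sketch below renders these entities as plain Java classes. The field names are our assumptions, inferred from Figure 14 and the descriptions above; they are not the actual SDS schema.

```java
import java.util.List;

// Illustrative domain classes mirroring the Swarm Debugging meta-model.
// Field names are assumptions inferred from Figure 14, not the SDS source.
class Developer  { String name; }
class Product    { String name; List<String> eclipseProjects; }
class Task       { String title; }
class Session    { Developer developer; Product product; Task task; }
class Type       { String namespaceName; String fullName; String sourceFile; }
class Method     { Type declaringType; String signature; }
class Invocation { Method invoking; Method invoked; Session session; }
class Breakpoint { Type type; Method method; int lineNumber; Session session; }
class Event      { Session session; String kind; long timestamp; }
```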

The SDS provides several services for manipulating, querying, and searching the collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework.29 Create, retrieve, update,

29 http://projects.spring.io/spring-boot


and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
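For illustration, the following minimal Java 11+ client issues that request and prints the JSON response. The host and endpoint follow the Spring Data REST convention shown above and are assumptions about the deployed service, not a documented client API.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmApiClient {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // Query the (hypothetical) SDS instance for developers named "petrillo".
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON list of matching developers
    }
}
```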

SQL Query Console The SDS provides a console30 that accepts SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine The SDS also provides an ElasticSearch,31 a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig. 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig. 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database, without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
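Such queries can also be issued programmatically. The sketch below uses the Neo4J Java driver (the 4.x API is assumed); the bolt URI, the credentials, and the (:Method)-[:INVOKES]->(:Method) schema are our assumptions for illustration, not the SDS's documented graph model.

```java
import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class SwarmGraphQuery {
    public static void main(String[] args) {
        // Connection settings are placeholders; point them at the SDS Neo4J instance.
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "password"));
             Session session = driver.session()) {
            // Hypothetical schema: methods linked by INVOKES edges, as in Figure 16.
            Result result = session.run(
                    "MATCH (a:Method)-[:INVOKES]->(b:Method) "
                    + "RETURN a.name AS invoking, b.name AS invoked LIMIT 25");
            while (result.hasNext()) {
                Record record = result.next();
                System.out.println(record.get("invoking").asString()
                        + " -> " + record.get("invoked").asString());
            }
        }
    }
}
```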

Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and the stack trace items are analyzed by the Tracer, extracting method invocations.
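As a minimal sketch of how such a tracer hooks into the Eclipse Debug Core framework (the interfaces and registration calls are the real Eclipse APIs named above, while the class body is illustrative and not the actual SDT source):

```java
import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    /** Subscribe to debugger events and breakpoint changes. */
    public void register() {
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // A step (Into/Over/Return) just ended: inspect the suspended thread's
            // stack trace here and send the invoking/invoked pair to the SDS (omitted).
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.STEP_END) {
                // ...
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // Capture the type and location of the new breakpoint and POST it to the SDS.
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}
```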

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig. 17 The Swarm Tracer architecture [17]

Fig. 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing


Fig. 19 Breakpoint search tool (fuzzy search example)

an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


Fig. 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
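For illustration, a fuzzy query like the one behind Figure 19 could be sent directly to the underlying ElasticSearch engine as sketched below. The index and field names (breakpoints, typeName) and the host are assumptions, not the SDS's actual mapping; the query body uses ElasticSearch's standard fuzzy-query syntax.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BreakpointFuzzySearch {
    public static void main(String[] args) throws Exception {
        // ElasticSearch fuzzy query: matches "Factory" despite the typo "fcatory".
        String query = "{\"query\": {\"fuzzy\": "
                + "{\"typeName\": {\"value\": \"fcatory\", \"fuzziness\": \"AUTO\"}}}}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```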


Fig. 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by other methods but do not invoke any themselves.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair ⟨Vi, Vj⟩, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes invoked by methods, then the Starting and Ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
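The sketch below computes these two sets from a list of collected invocation edges; it is a direct transcription of the definitions above (α \ β and β \ α), not SDI code, and the sample edges are invented for the Bridge example of Figure 21.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GraphEndpoints {
    // An edge <invoking, invoked> as collected during a debugging session.
    record Edge(String invoking, String invoked) { }

    // Starting methods: invoke others but are never invoked (alpha minus beta).
    static Set<String> startingPoints(List<Edge> edges) {
        Set<String> alpha = new HashSet<>(), beta = new HashSet<>();
        for (Edge e : edges) { alpha.add(e.invoking()); beta.add(e.invoked()); }
        Set<String> result = new HashSet<>(alpha);
        result.removeAll(beta);
        return result;
    }

    // Ending methods: are invoked but never invoke others (beta minus alpha).
    static Set<String> endingPoints(List<Edge> edges) {
        Set<String> alpha = new HashSet<>(), beta = new HashSet<>();
        for (Edge e : edges) { alpha.add(e.invoking()); beta.add(e.invoked()); }
        Set<String> result = new HashSet<>(beta);
        result.removeAll(alpha);
        return result;
    }

    public static void main(String[] args) {
        List<Edge> edges = List.of(
                new Edge("main", "Circle.draw"),
                new Edge("Circle.draw", "DrawingAPI.drawCircle"));
        System.out.println(startingPoints(edges)); // [main]
        System.out.println(endingPoints(edges));   // [DrawingAPI.drawCircle]
    }
}
```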

Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration into sessions and their call graphs makes them easy to understand, because only intentionally visited areas are shown on these


graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

Page 25: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

Swarm Debugging the Collective Intelligence on Interactive Debugging 25

group (seven participants) and an experimental group (six participants) Par-ticipants from the control group performed fault location for Issues 993 and1026 without using the GV while those from the experimental group didthe same tasks using the GV

616 Data Collection

In the qualitative evaluation the participants answered the questions directlyin an electronic form They used the GV available on-line25 with collected datafor JabRef Issues 318 667 669

In the controlled experiment, each participant executed the warm-up task. This task consisted of starting a debugging session, toggling a breakpoint, and debugging a Tetris program to locate a given method. After the warm-up task, each participant executed debugging sessions to find the location of the faults described in the five issues. We set a time constraint of one hour. We asked participants to control their fatigue, asking them to go to the next task if they felt tired, while informing us of this situation in their reports. Finally, each participant filled in a report to provide answers and other information, like whether they completed the tasks successfully or not and (just for the experimental group) comments on the usefulness of GV during each task.

All services were available on our server (http://server.swarmdebugging.org) during the debugging sessions, and the experimental data were collected within three days. We also captured video from the participants, obtaining more than 3 hours of debugging. The experiment tutorial contained the instructions to install and set up the Open Broadcaster Software (OBS, available at https://obsproject.com) for video recording.

6.2 Results

We now discuss the results of our evaluation.

RQ5: Is Swarm Debugging's Global View useful in terms of supporting debugging tasks?

During the qualitative evaluation, we asked the participants to analyse the graph generated by GV to identify the type of the location of each fault, without reading the task description or looking at the code. The GV-generated graph had invocations collected from previous debugging sessions. These invocations were generated during "good" sessions (the fault was found – 27/31 sessions) and "bad" sessions (the fault was not found – 4/31 sessions). We analysed the results obtained for Tasks 318, 667, and 669, comparing the number of participants who could propose a solution and the correctness of the solutions.




For Task 318 (Figure 8), 95% of participants (22/23) could suggest a "candidate" type for the location of the fault just by using the GV view. Among these participants, 52% (12/23) correctly suggested AuthorsFormatter as the problematic type.

Fig. 8 GV for Task 318

For Task 667 (Figure 9), 95% of participants (22/23) could suggest a "candidate" type for the problematic code just by analysing the graph provided by the GV. Among these participants, 31% (7/23) correctly suggested that URLUtil was the problematic type.

Fig. 9 GV for Task 667


Fig. 10 GV for Task 669

Finally, for Task 669 (Figure 10), again 95% of participants (22/23) could suggest a "candidate" for the type in the problematic code just by looking at the GV. However, none of them (i.e., 0% (0/23)) provided the correct answer, which was OpenDatabaseAction.


Our results show that combining stepping paths in a graph visualisation from several debugging sessions helps developers produce correct hypotheses about fault locations, without seeing the code previously.

RQ6: Is Swarm Debugging's Global View useful in terms of sharing debugging tasks?

We analysed each video recording and searched for evidence of GV utilisation during fault-location tasks. Our controlled experiment showed that 100% of participants of the experimental group used GV to support their tasks (video recording analysis), navigating, reorganizing, and especially diving into the type by double-clicking on a selected type. We asked participants if GV is useful to support software maintenance tasks. We report that 87% of participants agreed that GV is useful or very useful (100% at least useful) through our qualitative study (Figure 11), and 75% of participants claimed that GV is useful or very useful (100% at least useful) on the task survey after the fault-location tasks (Figure 12). Furthermore, several participants' feedback supports our answers.


Fig. 11 GV usefulness - experimental phase one

Fig. 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks.

Sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.


Table 10 Results from control and experimental groups (average)

Task 0993

Metric             Control [C]   Experiment [E]   ∆ [C−E] (s)   [E/C] (%)
First breakpoint   00:02:55      00:03:40         -44           126%
Time to start      00:04:44      00:05:18         -33           112%
Elapsed time       00:30:08      00:16:05         843           53%

Task 1026

Metric             Control [C]   Experiment [E]   ∆ [C−E] (s)   [E/C] (%)
First breakpoint   00:02:42      00:04:48         -126          177%
Time to start      00:04:02      00:03:43         19            92%
Elapsed time       00:24:58      00:20:41         257           83%

6.3 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics: (1) the time for setting the first breakpoint, (2) the time to start a debugging session, and (3) the elapsed time to finish the task. We analysed the recording sessions of Tasks 0993 and 1026, compiling the average results from the two groups in Table 10.

Observing the results in Table 10, we observed that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to carefully toggle the first breakpoint but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time to decide on the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants' Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants' feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We return to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that "[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact", which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers' debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding "bottlenecks". In particular, one participant wrote that our visualisation "allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from".

6.4.2 Intrinsic Limitations

Location. One participant commented that "the location where [an] issue occurs is not the same as the one that is responsible for the issue". We are well aware of this difference between the location where a fault occurs, for example, a null-pointer exception, and the location of the source of the fault, for example, a constructor where a field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation and to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers' paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers' paths, while colours could indicate the "importance" of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a "trap" if all developers whose paths are displayed followed the "wrong" paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well, because our visualisation "[a]llows to quickly see where the problem probably has been before it got fixed", while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that the larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing in the hope to find some hints. And indeed I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point to further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for Software Engineering researchers, developers, developers of debuggers, and educators. The SDI (and GV) is open and freely available on-line (http://github.com/swarmdebugging), and researchers can use them to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one by providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools – using novel data-mining techniques – to integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (a line of code, a class, or a method) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.



We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens, and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant for the first and last tasks. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that all participants did not look at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history, and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that should be performed during software maintenance by developers to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths in a graph visualisation from several debugging sessions produced elements to support developers' hypotheses about fault locations, without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened some months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agency CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB – The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD – Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th International Conference on Software Engineering – ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 – Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering – FOSE 2014 (ACM Press, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications – OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension – ICPC 2014 (ACM Press, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering – FSE 2014 (ACM Press, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ESEC/FSE '09) (ACM Press, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 – Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings – International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings – IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig. 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. They include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig. 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is event data collected when a developer performs some actions during a debugging session.

The SDS provides several services for manipulating, querying, and searching the collected data: (1) the Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework (http://projects.spring.io/spring-boot). Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

    http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose name is "petrillo", in JSON format.
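For illustration, a Spring Data REST repository along the following lines would expose such a search endpoint. This is only a sketch: the Developer entity and the repository name are assumptions based on the domain concepts above, not the actual SDS source code.

    // Hypothetical sketch of a Spring Data REST repository that would expose
    // /developers/search/findByName?name=... as in the request above.
    // 'Developer' is assumed to be the JPA entity from Figure 14.
    import java.util.List;
    import org.springframework.data.repository.CrudRepository;
    import org.springframework.data.repository.query.Param;
    import org.springframework.data.rest.core.annotation.RepositoryRestResource;

    @RepositoryRestResource(collectionResourceRel = "developers", path = "developers")
    public interface DeveloperRepository extends CrudRepository<Developer, Long> {
        // Spring Data derives the query from the method name;
        // the endpoint becomes /developers/search/findByName.
        List<Developer> findByName(@Param("name") String name);
    }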

SQL Query Console. The SDS provides a console (http://db.swarmdebugging.org) to receive SQL queries on the debugging data, providing relational aggregations and functions.
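As an example of the kind of query the console supports, the following SQL would rank the types that accumulated the most breakpoints; the table and column names are assumptions following the domain concepts above, and the actual schema may differ:

    -- Hypothetical schema: tables named after the domain concepts above.
    SELECT t.name AS type_name, COUNT(*) AS breakpoints
    FROM breakpoint b
    JOIN type t ON t.id = b.type_id
    GROUP BY t.name
    ORDER BY breakpoints DESC
    LIMIT 10;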

Full-text Search Engine. The SDS also provides ElasticSearch (https://www.elastic.co), a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig. 15 Swarm Debugging Dashboard



Fig. 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J (http://neo4j.com) graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
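A query in the spirit of the one in Figure 16 could retrieve the invocations recorded for a given session; the node labels, relationship type, and property name below are illustrative assumptions, not the exact SDS graph model:

    // Hypothetical labels and relationship, following the domain concepts above.
    MATCH (caller:Method)-[i:INVOKES]->(callee:Method)
    WHERE i.session = 42
    RETURN caller, callee
    LIMIT 50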

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
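A minimal sketch of how such a tracer hooks into the Eclipse debug framework follows. The listener interfaces and registration calls are the standard org.eclipse.debug.core API; the class body and comments are simplified placeholders for the actual SDT logic:

    import org.eclipse.core.resources.IMarkerDelta;
    import org.eclipse.debug.core.DebugEvent;
    import org.eclipse.debug.core.DebugPlugin;
    import org.eclipse.debug.core.IBreakpointListener;
    import org.eclipse.debug.core.IDebugEventSetListener;
    import org.eclipse.debug.core.model.IBreakpoint;

    public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

        public void start() {
            // Register for debug events (suspend/resume/stepping) and breakpoint changes.
            DebugPlugin.getDefault().addDebugEventListener(this);
            DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
        }

        @Override
        public void handleDebugEvents(DebugEvent[] events) {
            for (DebugEvent event : events) {
                if (event.getKind() == DebugEvent.SUSPEND
                        && event.getDetail() == DebugEvent.STEP_END) {
                    // A step (into/over/return) finished: inspect the stack trace
                    // here and send an invocation record to the SDS (omitted).
                }
            }
        }

        @Override
        public void breakpointAdded(IBreakpoint breakpoint) {
            // Capture the breakpoint's type and location and POST it to the SDS (omitted).
        }

        @Override
        public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

        @Override
        public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
    }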

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.



Fig. 17 The Swarm Tracer architecture [17]

Fig. 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig. 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices, which send RESTful messages to a SwarmRestFacade connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated at the top of a tree, and adjacent nodes represent invocation sequences.


Fig. 20 Sequence stack diagram for the Bridge design pattern

Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
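Such a fuzzy search maps directly onto the ElasticSearch query DSL. The sketch below assumes an index named breakpoint with a type_name field; both names are illustrative, not the actual SDS mapping:

    // Hypothetical index/field names; fuzziness lets "fcatory" match "factory".
    POST /breakpoint/_search
    {
      "query": {
        "match": {
          "type_name": {
            "query": "fcatory",
            "fuzziness": "AUTO"
          }
        }
      }
    }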


Fig. 21 Method call graph for the Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V_1, V_2, ..., V_n} and E is a set of edges E = {(V_1, V_2), (V_1, V_3), ...}, each edge is formed by a pair <V_i, V_j>, where V_i is the invoking method and V_j is the invoked method. If α is the subset of all vertexes invoking methods and β is the subset of all vertexes invoked by methods, then the Starting and Ending methods are:

StartingPoint = { V_SP | V_SP ∈ α ∧ V_SP ∉ β }

EndingPoint = { V_EP | V_EP ∈ β ∧ V_EP ∉ α }

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
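To make the definition concrete, the computation over a list of invocation edges can be sketched as follows; plain strings stand for methods here, so this illustrates the definition rather than the actual SDI implementation:

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public final class StartEndMethods {

        /** An invocation edge: 'caller' invokes 'callee'. */
        public record Invocation(String caller, String callee) { }

        /** Starting methods: appear as invokers (alpha) but never as invoked (beta). */
        public static Set<String> startingPoints(List<Invocation> edges) {
            Set<String> alpha = new HashSet<>();
            Set<String> beta = new HashSet<>();
            for (Invocation e : edges) {
                alpha.add(e.caller());
                beta.add(e.callee());
            }
            alpha.removeAll(beta); // keep V in alpha with V not in beta
            return alpha;
        }

        /** Ending methods: appear as invoked (beta) but never as invokers (alpha). */
        public static Set<String> endingPoints(List<Invocation> edges) {
            Set<String> alpha = new HashSet<>();
            Set<String> beta = new HashSet<>();
            for (Invocation e : edges) {
                alpha.add(e.caller());
                beta.add(e.callee());
            }
            beta.removeAll(alpha); // keep V in beta with V not in alpha
            return beta;
        }
    }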

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developer explorations. Moreover, dividing software exploration by sessions makes its call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 26: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

26Please give a shorter version with authorrunning and titlerunning prior to maketitle

number of participants who could propose a solution and the correctness ofthe solutions

For Task 318 (Figure 8) 95 of participants (2223) could suggest a ldquocan-didaterdquo type for the location of the fault just by using the GV view Amongthese participants 52 (1223) suggested correctly AuthorsFormatteras the problematic type

Fig 8 GV for Task 0318

For Task 667 (Figure 9) 95 of participants (2223) could suggest a ldquocan-didaterdquo type for the problematic code just analysing the graph provided bythe GV Among these participants 31 (723) suggested correctly thatURLUtil was the problematic type

Fig 9 GV for Task 0667

Swarm Debugging the Collective Intelligence on Interactive Debugging 27

Fig 10 GV for Task 0669

Finally for Task 669 (Figure 10) again 95 of participants (2223) couldsuggest a ldquocandidaterdquo for the type in the problematic code just by lookingat the GV However none of them (ie 0 (023)) provided the correctanswer which was OpenDatabaseAction

13

Our results show that combining stepping paths ina graph visualisation from several debugging ses-sions help developers produce correct hypothesesabout fault locations without see the code previ-ously

RQ6 Is Swarm Debuggingrsquos Global View useful in terms of sharing debuggingtasks

We analysed each video recording and searched for evidence of GV utilisationduring fault-locations tasks Our controlled experiment showed that 100 ofparticipants of the experimental group used GV to support their tasks (videorecording analysis) navigating reorganizing and especially diving into thetype double-clicking on a selected type We asked participants if GV is usefulto support software maintenance tasks We report that 87 of participantsagreed that GV is useful or very useful (100 at least useful) throughour qualitative study (Figure 11) and 75 of participants claimed thatGV is useful or very useful (100 at least useful) on the task sur-vey after fault-location tasks (Figure 12) Furthermore several participantsrsquofeedback supports our answers

28Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 11 GV usefulness - experimental phase one

Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks

Sharing previous debugging sessions supports de-bugging hypotheses and consequently reduces theeffort on searching of code

Swarm Debugging the Collective Intelligence on Interactive Debugging 29

Table 10 Results from control and experimental groups (average)

Task 0993

Metric Control [C] Experiment [E] ∆ [C-E] (s) [EC]

First breakpoint 000255 000340 -44 126

Time to start 000444 000518 -33 112

Elapsed time 003008 001605 843 53

Task 1026

Metric Control [C] Experiment [E] ∆ [C-E] (s) [EC]

First breakpoint 000242 000448 -126 177

Time to start 000402 000343 19 92

Elapsed time 002458 002041 257 83

63 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics (1) thetime for setting the first breakpoint (2) the time to start a debugging sessionand (3) the elapsed time to finish the task We analysed recording sessionsof Tasks 0993 and 1026 compiling the average results from the two groups inTable 10

Observing the results in Table 10, we observed that the experimental group spent more time to set the first breakpoint (26% more time for Task 0993 and 77% more time for Task 1026). The times to start a debugging session are nearly the same (12% more time for Task 0993 and 8% less time for Task 1026) when compared to the control group. However, participants who used our approach spent less time to finish both tasks (47% less time for Task 0993 and 17% less time for Task 1026). This result suggests that participants invested more time to toggle the first breakpoint carefully but subsequently completed the tasks faster than participants who toggled breakpoints quickly, corroborating our results in RQ2.

Our results show that participants who used the shared debugging data invested more time to decide the first breakpoint but reduced their time to finish the tasks. These results suggest that sharing debugging information using Swarm Debugging can reduce the time spent on debugging tasks.


6.4 Participants’ Feedback

As with any visualisation technique proposed in the literature, ours is a proof of concept with both intrinsic and accidental advantages and limitations. Intrinsic advantages and limitations pertain to the visualisation itself and our design choices, while accidental advantages and limitations concern our implementation. During our experiment, we collected the participants’ feedback about our visualisation and now discuss both its intrinsic and accidental advantages and limitations, as reported by them. We go back to some of the limitations in the next section, which describes threats to the validity of our experiment. We also report feedback from three of the participants.

6.4.1 Intrinsic Advantages

Visualisation of Debugging Paths. Participants commended our visualisation for presenting useful information related to the classes and methods followed by other developers during debugging. In particular, one participant reported that “[i]t seems a fairly simple way to visualize classes and to demonstrate how they interact”, which comforts us in our choice of both the visualisation technique (graphs) and the data to display (developers’ debugging paths).

Effort in Debugging. Three participants also mentioned that our visualisation shows where developers spent their debugging effort and where there are understanding “bottlenecks”. In particular, one participant wrote that our visualisation “allows the developer to skip several steps in debugging, knowing from the graph where the problem probably comes from”.

6.4.2 Intrinsic Limitations

Location. One participant commented that “the location where [an] issue occurs is not the same as the one that is responsible for the issue”. We are well aware of this difference between the location where a fault occurs, for example a null-pointer exception, and the location of the source of the fault, for example a constructor where the field is not initialised.

However, we build our visualisation on the premise that developers can share their debugging activities for that particular reason: by sharing, they could readily identify the source of a fault rather than only the location where it occurs. We plan to perform further studies to assess the usefulness of our visualisation to validate (or not) our premise.

Scalability. Several participants commented on the possible lack of scalability of our visualisation. Graphs are well known not to scale, so we expect issues with larger graphs [34]. Strategies to mitigate these issues include graph sampling and clustering. We plan to add these features in the next release of our technique.


Presentation. Several participants also commented on the (relative) lack of information brought by the visualisation, which is complementary to the limitation in scalability.

One participant commented on the difference between the graph showing the developers’ paths and the relative importance of classes during execution. Future work should seek to combine both pieces of information on the same graph, possibly by combining size and colours: size could relate to the developers’ paths, while colours could indicate the “importance” of a class during execution.

Evolution. One participant commented that the graph is relevant for one version of the system but that, as soon as some changes are performed by a developer, the paths (or parts thereof) may become irrelevant.

We agree with the participant and accept this limitation because our visualisation is currently implemented for one version. We will explore in future work how to handle evolution by changing the graph as new versions are created.

Trap. One participant warned that our visualisation could lead developers into a “trap” if all developers whose paths are displayed followed the “wrong” paths. We agree with the participant but accept this limitation because developers can always choose appropriate paths.

Understanding. One participant reported that the visualisation alone does not bring enough information to understand the task at hand. We accept this limitation because our visualisation is built to be complementary to other views available in the IDE.

6.4.3 Accidental Advantages

Reducing Code Complexity. One participant discussed the use of our visualisation to reduce code complexity for the developers by highlighting its main functionalities.

Complementing Differential Views. Another participant contrasted our visualisation with Git Diff and mentioned that they complement each other well because our visualisation “[a]llows to quickly see where the problem probably has been before it got fixed”, while Git Diff allows seeing where the problem was fixed.

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and–or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will add it also into the visualisation, possibly using tooltips.

One participant added that more information, such as “execution time metrics [by] invocations” and “failure/success rate [by] invocations”, could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers’ performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show in different parts of the graph the methods and their classes independently, to avoid large nodes. We plan to modify the graph visualisation to have a “method-level” view, whose nodes could be methods and–or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question “Describe your debugging experience”. All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I’ve found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn’t require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another’s debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to point further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for Software Engineering researchers, developers, debuggers’ developers, and educators. SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and–or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-open issues and–or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help to assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent to toggle a new breakpoint.

Developers of debuggers can use SDI to understand developers’ debugging habits to create new tools – using novel data-mining techniques – to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students’ debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set in different tasks by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers’ debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers’ breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers’ debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern their internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers’ interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants’ screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc., through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant for the first and last task. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related Work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history, and defect report similarity [38], while others, like Mylyn [41], use developers’ interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn’s [41]. First, Mylyn’s approach does not collect or use any dynamic debugging information; it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn’s are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the times spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the “true” locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date, there has been no evidence that they significantly ease developers’ debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs’ outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., means that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that should be done during software maintenance by developers to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers’ debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers’ videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same line of code for the same task, and also different developers toggle breakpoints on the same classes or methods for different tasks, showing the existence of important “debugging hot-spots” (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers’ debugging sessions. From this second study, we concluded that (1) combining stepping paths in a graph visualisation from several debugging sessions produced elements to support developers’ hypotheses about fault locations, without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort of searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened sometimes, months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers’ use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated “hunters”, and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated “builders”. Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers’ debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularity of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson. Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer’s Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs. Debugging with GDB – The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright. GNU DDD – Data Display Debugger (2010)
6. A. Ko. Proceeding of the 28th International Conference on Software Engineering – ICSE ’06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 – Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers. 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the Future of Software Engineering – FOSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman. How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen. Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Röhm. Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng. Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar. Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD ’07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz. Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel. Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://link.springer.com/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick. Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn. Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming. IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter. IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers. Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications – OOPSLA ’97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker. Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices. Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder. IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung. IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension – ICPC 2014 (ACM Press, New York, NY, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering – FSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant. International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson. Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ESEC/FSE) (ACM Press, New York, NY, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso. Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA ’11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken. Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 – Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, NY, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa. Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings – International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer. Proceedings – IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE ’99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan. ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551

Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig. 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We choose and define domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig. 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is event data collected when a developer performs some actions during a debugging session.
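To make the meta-model concrete, the sketch below shows how one of these concepts could be mapped to a persistent Java entity. It is a minimal, hypothetical illustration: the class and field names are ours, and the real SDS schema (persisted in PostgreSQL via Spring Boot) may differ.

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

// Hypothetical JPA mapping of the Breakpoint concept of Figure 14.
@Entity
public class Breakpoint {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    // In the full meta-model, these would be relations to the Session,
    // Type, and Method entities; plain columns keep this sketch self-contained.
    private Long sessionId;
    private String typeName;
    private String methodName;
    private int lineNumber;   // source line where the breakpoint was toggled

    // getters and setters omitted for brevity
}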

The SDS provides several services for manipulating, querying, and searching the collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are “petrillo”, in JSON format.

29 http://projects.spring.io/spring-boot
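As an illustration, this endpoint can be called from any HTTP client. The sketch below uses the standard java.net.http client (Java 11+); the endpoint and query parameter come from the example above, while the exact JSON structure of the response is not specified here:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmApiExample {
    public static void main(String[] args) throws Exception {
        // Query the SDS RESTful API for developers named "petrillo".
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/"
                        + "search/findByName?name=petrillo"))
                .GET()
                .build();

        // The SDS answers with a JSON document listing the matching developers.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}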

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig. 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig. 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
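For readers unfamiliar with Cypher, the sketch below shows how such a query could be issued programmatically with the Neo4j Java driver. The node labels, property names, and credentials are hypothetical, not necessarily those of the actual SDS graph:

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class SwarmGraphQueryExample {
    public static void main(String[] args) {
        // Connect to the graph database that stores the debugging data.
        try (Driver driver = GraphDatabase.driver(
                "bolt://localhost:7687", AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {

            // Hypothetical pattern: which methods are invoked from methods
            // of a given type during the recorded debugging sessions?
            Result result = session.run(
                "MATCH (caller:Method)-[:INVOKES]->(callee:Method) "
                + "WHERE caller.typeName = 'BibtexParser' "
                + "RETURN caller.name, callee.name");

            while (result.hasNext()) {
                System.out.println(result.next().asMap());
            }
        }
    }
}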

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
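A minimal sketch of how such listeners can be hooked into the Eclipse debug framework is shown below. It is our illustration of the mechanism, not the actual SDT source code; the class name and the callback bodies are ours:

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        // Receive debug events (suspend, resume, step, ...).
        DebugPlugin.getDefault().addDebugEventListener(this);
        // Receive breakpoint additions, removals, and changes.
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // A suspension with detail STEP_END follows a Step Into/Over/Return.
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.STEP_END) {
                // Here, the SDT would analyse the stack trace and send a
                // RESTful message describing the stepping event to the SDS.
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // Here, the SDT would record the type and location of the breakpoint.
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}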

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints or trigger stepping events, such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the view “Swarm Manager” and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers’ interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig. 17 The Swarm Tracer architecture [17]

Fig. 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated to these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig. 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method) and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated on top of a tree, and adjacent nodes represent invocation sequences.


Fig. 20 Sequence stack diagram for the Bridge design pattern

Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of breakpoint search in which the search box contains the misspelled word fcatory.
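For illustration, a fuzzy search like the one in Figure 19 could be expressed with ElasticSearch’s standard query DSL, as sketched below. The breakpoints index and the typeName field are hypothetical stand-ins for the actual SDS schema:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BreakpointFuzzySearchExample {
    public static void main(String[] args) throws Exception {
        // Standard ElasticSearch fuzzy query: "fcatory" still matches
        // types whose names are close to "factory".
        String query = "{ \"query\": { \"fuzzy\": "
                + "{ \"typeName\": { \"value\": \"fcatory\" } } } }";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // ranked hits, in JSON
    }
}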


Fig. 21 Method call graph for the Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V_1, V_2, ..., V_n} and E is a set of edges E = {(V_1, V_2), (V_1, V_3), ...}, each edge is formed by a pair <V_i, V_j>, where V_i is the invoking method and V_j is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked, then the Starting and Ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
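These sets are straightforward to compute from the collected invocations. The sketch below is our minimal illustration over a list of invoking/invoked method pairs; the names and sample data are ours:

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StartEndMethodsExample {

    record Invocation(String invoking, String invoked) { }

    public static void main(String[] args) {
        List<Invocation> edges = List.of(
                new Invocation("main", "Circle.draw"),
                new Invocation("Circle.draw", "Canvas.paint"));

        Set<String> alpha = new HashSet<>(); // vertexes that invoke methods
        Set<String> beta = new HashSet<>();  // vertexes that are invoked
        for (Invocation e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }

        // StartingPoint = alpha \ beta: invoke others, never invoked.
        Set<String> starting = new HashSet<>(alpha);
        starting.removeAll(beta);

        // EndingPoint = beta \ alpha: invoked, never invoke.
        Set<String> ending = new HashSet<>(beta);
        ending.removeAll(alpha);

        System.out.println("Starting: " + starting); // [main]
        System.out.println("Ending: " + ending);     // [Canvas.paint]
    }
}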

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory for developer explorations. Moreover, by dividing software exploration into sessions, the call graphs are easy to understand because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 27: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

Swarm Debugging the Collective Intelligence on Interactive Debugging 27

Fig 10 GV for Task 0669

Finally for Task 669 (Figure 10) again 95 of participants (2223) couldsuggest a ldquocandidaterdquo for the type in the problematic code just by lookingat the GV However none of them (ie 0 (023)) provided the correctanswer which was OpenDatabaseAction

13

Our results show that combining stepping paths ina graph visualisation from several debugging ses-sions help developers produce correct hypothesesabout fault locations without see the code previ-ously

RQ6 Is Swarm Debuggingrsquos Global View useful in terms of sharing debuggingtasks

We analysed each video recording and searched for evidence of GV utilisationduring fault-locations tasks Our controlled experiment showed that 100 ofparticipants of the experimental group used GV to support their tasks (videorecording analysis) navigating reorganizing and especially diving into thetype double-clicking on a selected type We asked participants if GV is usefulto support software maintenance tasks We report that 87 of participantsagreed that GV is useful or very useful (100 at least useful) throughour qualitative study (Figure 11) and 75 of participants claimed thatGV is useful or very useful (100 at least useful) on the task sur-vey after fault-location tasks (Figure 12) Furthermore several participantsrsquofeedback supports our answers

28Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 11 GV usefulness - experimental phase one

Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks

Sharing previous debugging sessions supports de-bugging hypotheses and consequently reduces theeffort on searching of code

Swarm Debugging the Collective Intelligence on Interactive Debugging 29

Table 10 Results from control and experimental groups (average)

Task 0993

Metric Control [C] Experiment [E] ∆ [C-E] (s) [EC]

First breakpoint 000255 000340 -44 126

Time to start 000444 000518 -33 112

Elapsed time 003008 001605 843 53

Task 1026

Metric Control [C] Experiment [E] ∆ [C-E] (s) [EC]

First breakpoint 000242 000448 -126 177

Time to start 000402 000343 19 92

Elapsed time 002458 002041 257 83

63 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics (1) thetime for setting the first breakpoint (2) the time to start a debugging sessionand (3) the elapsed time to finish the task We analysed recording sessionsof Tasks 0993 and 1026 compiling the average results from the two groups inTable 10

Observing the results in Table 10 we observed that the experimental groupspent more time to set the first breakpoint (26 more time for Task 0993 and77 more time for Task 1026) The times to start a debugging session are nearthe same (12 more time for Task 0993 and 18 less time for Task 1026) whencompared to the control group However participants who used our approachspent less time to finish both tasks (47 less time to Task 0993 and 17less time for Task 1026) This result suggests that participants invested moretime to toggle carefully the first breakpoint but consecutively completed thetasks faster than participants who toggled breakpoints quickly corroboratingour results in RQ2

Our results show that participants who used theshared debugging data invested more time to de-cide the first breakpoint but reduced their timeto finish the tasks These results suggest thatsharing debugging information using Swarm De-bugging can reduce the time spent on debuggingtasks

30Please give a shorter version with authorrunning and titlerunning prior to maketitle

64 Participantsrsquo Feedback

As with any visualisation technique proposed in the literature ours is a proofof concept with both intrinsic and accidental advantages and limitations In-trinsic advantages and limitations pertain to the visualisation itself and ourdesign choices while accidental advantages and limitations concern our im-plementation During our experiment we collected the participantsrsquo feedbackabout our visualisation and now discuss both its intrinsic and accidental ad-vantages and limitations as reported by them We go back to some of thelimitations in the next section that describes threats to the validity of ourexperiment We also report feedback from three of the participants

641 Intrinsic Advantage

Visualisation of Debugging Paths Participants commended our visualisationfor presenting useful information related to the classes and methods followedby other developers during debugging In particular one participant reportedthat ldquo[i]t seems a fairly simple way to visualize classes and to demonstratehow they interactrdquo which comforts us in our choice of both the visualisationtechnique (graphs) and the data to display (developersrsquo debugging paths)

Effort in Debugging Three participants also mentioned that our visualisationshows where developers spent their debugging effort and where there are un-derstanding ldquobottlenecksrdquo In particular one participant wrote that our vi-sualisation ldquoallows the developer to skip several steps in debugging knowingfrom the graph where the problem probably comes fromrdquo

642 Intrinsic Limitations

Location One participant commented that ldquothe location where [an] issue oc-curs is not the same as the one that is responsible for the issuerdquo We are wellaware of this difference between the location where a fault occurs for exam-ple a null-pointer exception and the location of the source of the fault forexample a constructor where the field is not initialisedrdquo

However we build our visualisation on the premise that developers canshare their debugging activities for that particular reason by sharing theycould readily identify the source of a fault rather than only the location whereit occurs We plan to perform further studies to assess the usefulness of ourvisualisation to validate (or not) our premise

Scalability Several participants commented on the possible lack of scalabilityof our visualisation Graphs are well known to be not scalable so we areexpecting issues with larger graphs [34] Strategies to mitigate these issuesinclude graph sampling and clustering We plan to add these features in thenext release of our technique

Swarm Debugging the Collective Intelligence on Interactive Debugging 31

Presentation Several participants also commented on the (relative) lack ofinformation brought by the visualisation which is complementary to the lim-itation in scalability

One participant commented on the difference between the graph showingthe developersrsquo paths and the relative importance of classes during executionFuture work should seek to combine both information on the same graph pos-sibly by combining size and colours size could relate to the developersrsquo pathswhile colours could indicate the ldquoimportancerdquo of a class during execution

Evolution One participant commented that the graph is relevant for one ver-sion of the system but that as soon as some changes are performed by adeveloper the paths (or parts thereof) may become irrelevant

We agree with the participant and accept this limitation because our vi-sualisation is currently implemented for one version We will explore in futurework how to handle evolution by changing the graph as new versions are cre-ated

Trap One participant warned that our visualisation could lead developers intoa ldquotraprdquo if all developers whose paths are displayed followed the ldquowrongrdquopaths We agree with the participant but accept this limitation because devel-opers can always choose appropriate paths

Understanding One participant reported that the visualisation alone does notbring enough information to understand the task at hand We accept thislimitation because our visualisation is built to be complementary to otherviews available in the IDE

643 Accidental Advantages

Reducing Code Complexity One participant discussed the use of our visuali-sation to reduce code complexity for the developers by highlighting its mainfunctionalities

Complementing Differential Views Another participant contrasted our visu-alisation with Git Diff and mentioned that they complement each other wellbecause our visualisation ldquo[a]llows to quickly see where the problem probablyhas been before it got fixedrdquo while Git Diff allows seeing where the problemwas fixed

Highlighting Refactoring Opportunities. A third participant suggested that larger nodes could represent classes that could be refactored, if they also have many faults, to simplify future debugging sessions for developers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it into the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show, in different parts of the graph, the methods and their classes independently to avoid large nodes. We plan to modify the graph visualisation to have a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning, I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to identify further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software-engineering researchers, developers, debuggers' developers, and educators. SDI (and GV) is open and freely available on-line28, and researchers can use them to perform new empirical studies about debugging activities.

Developers can use SDI to record their debugging patterns, to identify debugging strategies that are more efficient in the context of their projects, and to improve their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, using breakpoints previously toggled by a developer could help assist another developer working on a similar task. For instance, the breakpoint search tools can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new one, providing developers with valid starting points. Therefore, the breakpoint searching tool can decrease the time spent toggling a new breakpoint.

Developers of debuggers can use SDI to understand developers' debugging habits and to create new tools, using novel data-mining techniques, to integrate different data sources. SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face the bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed with time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better-automated recommendations.

28 http://github.com/swarmdebugging


We have answered what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory on the setting of breakpoints and stepping by developers, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants we had. With 7 participants, we cannot claim generalisation of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI to obtain insights about developers' debugging activities. Future studies with a more significant number of participants and more systems and tasks are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants probably were aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., builds on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that should be performed by developers during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviours have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints, on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths in a graph visualisation from several debugging sessions produced elements to support developers' hypotheses about fault locations, without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and to support the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th International Conference on Software Engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Röhm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings, International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings, Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig. 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig. 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.

– Method is a method associated with a type, which can be invoked during debugging sessions.

– Namespace is a container for types. In Java, namespaces are declared with the keyword package.

– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).

– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.

– Event is the event data collected when a developer performs some actions during a debugging session.
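To make these concepts concrete, the following minimal sketch shows how the Breakpoint entity could be modelled in Java; the field names are our own illustration, not necessarily those of the actual SDS implementation.

    public class Breakpoint {
        private long id;
        private Session session;   // debugging session in which it was toggled
        private String typeName;   // fully qualified name of the enclosing type
        private String methodName; // enclosing method, if any
        private int lineNumber;    // source line where the breakpoint was set
        // getters and setters omitted for brevity
    }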

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework29.

29 http://projects.spring.io/spring-boot


Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
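For illustration only, the response could look like the following sketch, assuming the hypermedia (HAL) layout that Spring Data REST produces by default; the exact fields are an assumption on our part, not the documented payload.

    {
      "_embedded": {
        "developers": [
          { "id": 42, "name": "petrillo" }
        ]
      }
    }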

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
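As an example of the kind of relational aggregation this console allows, the following query sketch counts breakpoints per type; the table and column names are hypothetical, since the actual schema is not listed here.

    SELECT type_name, COUNT(*) AS breakpoint_count
    FROM breakpoint
    GROUP BY type_name
    ORDER BY breakpoint_count DESC;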

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig. 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig. 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
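As a sketch of what such a query could look like, the following Cypher statement lists the methods invoked in a given session; the node labels and relationship types are assumptions, as the actual graph schema is not specified here.

    MATCH (s:Session)-[:CONTAINS]->(i:Invocation)-[:INVOKED]->(m:Method)
    WHERE s.id = 1
    RETURN m.name, count(i) AS calls
    ORDER BY calls DESC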

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
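A minimal sketch of such a tracer, assuming the standard Eclipse debug API, could look as follows; the postToSwarmServices helper is hypothetical and stands in for the SDT's actual REST-based domain services.

    import org.eclipse.core.resources.IMarkerDelta;
    import org.eclipse.debug.core.DebugEvent;
    import org.eclipse.debug.core.DebugPlugin;
    import org.eclipse.debug.core.IBreakpointListener;
    import org.eclipse.debug.core.IDebugEventSetListener;
    import org.eclipse.debug.core.model.IBreakpoint;

    public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

        public void start() {
            // Register for debug events and breakpoint changes.
            DebugPlugin.getDefault().addDebugEventListener(this);
            DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
        }

        @Override
        public void handleDebugEvents(DebugEvent[] events) {
            for (DebugEvent event : events) {
                // SUSPEND events cover breakpoint hits and stepping
                // (Step Into, Step Over, Step Return).
                if (event.getKind() == DebugEvent.SUSPEND) {
                    postToSwarmServices("suspend", event.getDetail());
                }
            }
        }

        @Override
        public void breakpointAdded(IBreakpoint breakpoint) {
            // "lineNumber" is the standard marker attribute for the source line.
            postToSwarmServices("breakpointAdded",
                    breakpoint.getMarker().getAttribute("lineNumber", -1));
        }

        @Override
        public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { /* omitted */ }

        @Override
        public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { /* omitted */ }

        private void postToSwarmServices(String kind, int detail) {
            // Hypothetical helper: sends a RESTful message to the Swarm Debugging Services.
        }
    }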

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig. 17 The Swarm Tracer architecture [17]

Fig. 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods.


Fig. 19 Breakpoint search tool (fuzzy search example)

Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace, without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw() in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated at the top of a tree, and adjacent nodes represent invocation sequences.


Fig. 20 Sequence stack diagram for the Bridge design pattern

Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
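Under the hood, such a fuzzy search could correspond to an ElasticSearch query along these lines; the field name is an assumption for illustration, as the actual index mapping is not documented here.

    {
      "query": {
        "fuzzy": {
          "typeName": { "value": "fcatory", "fuzziness": "AUTO" }
        }
      }
    }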


Fig. 21 Method call graph for the Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph $G = (V, E)$, where $V$ is a set of vertexes $V = \{V_1, V_2, \ldots, V_n\}$ and $E$ is a set of edges $E = \{(V_1, V_2), (V_1, V_3), \ldots\}$, each edge is formed by a pair $\langle V_i, V_j \rangle$, where $V_i$ is the invoking method and $V_j$ is the invoked method. If $\alpha$ is the subset of all vertexes invoking methods and $\beta$ is the subset of all vertexes invoked by methods, then the starting and ending methods are:

$$StartingPoint = \{V_{SP} \mid V_{SP} \in \alpha \wedge V_{SP} \notin \beta\}$$

$$EndingPoint = \{V_{EP} \mid V_{EP} \in \beta \wedge V_{EP} \notin \alpha\}$$

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
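A direct way to compute these sets from the collected invocation pairs is sketched below in Java; representing invocations as pairs of method names is our simplification, not the SDS API.

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class StartEndFinder {

        // Each invocation is a pair: pair[0] invokes pair[1].
        public static Set<String> startingPoints(List<String[]> invocations) {
            Set<String> invoking = new HashSet<>();
            Set<String> invoked = new HashSet<>();
            for (String[] pair : invocations) {
                invoking.add(pair[0]);
                invoked.add(pair[1]);
            }
            // Starting methods invoke others but are never invoked themselves.
            invoking.removeAll(invoked);
            return invoking;
        }

        public static Set<String> endingPoints(List<String[]> invocations) {
            Set<String> invoking = new HashSet<>();
            Set<String> invoked = new HashSet<>();
            for (String[] pair : invocations) {
                invoking.add(pair[0]);
                invoked.add(pair[1]);
            }
            // Ending methods are invoked but never invoke others.
            invoked.removeAll(invoking);
            return invoked;
        }
    }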

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes its call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

Page 28: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

28Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 11 GV usefulness - experimental phase one

Fig 12 GV usefulness - experimental phase two

The analysis of our results suggests that GV is useful to support software-maintenance tasks

Sharing previous debugging sessions supports de-bugging hypotheses and consequently reduces theeffort on searching of code

Swarm Debugging the Collective Intelligence on Interactive Debugging 29

Table 10 Results from control and experimental groups (average)

Task 0993

Metric Control [C] Experiment [E] ∆ [C-E] (s) [EC]

First breakpoint 000255 000340 -44 126

Time to start 000444 000518 -33 112

Elapsed time 003008 001605 843 53

Task 1026

Metric Control [C] Experiment [E] ∆ [C-E] (s) [EC]

First breakpoint 000242 000448 -126 177

Time to start 000402 000343 19 92

Elapsed time 002458 002041 257 83

63 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics (1) thetime for setting the first breakpoint (2) the time to start a debugging sessionand (3) the elapsed time to finish the task We analysed recording sessionsof Tasks 0993 and 1026 compiling the average results from the two groups inTable 10

Observing the results in Table 10 we observed that the experimental groupspent more time to set the first breakpoint (26 more time for Task 0993 and77 more time for Task 1026) The times to start a debugging session are nearthe same (12 more time for Task 0993 and 18 less time for Task 1026) whencompared to the control group However participants who used our approachspent less time to finish both tasks (47 less time to Task 0993 and 17less time for Task 1026) This result suggests that participants invested moretime to toggle carefully the first breakpoint but consecutively completed thetasks faster than participants who toggled breakpoints quickly corroboratingour results in RQ2

Our results show that participants who used theshared debugging data invested more time to de-cide the first breakpoint but reduced their timeto finish the tasks These results suggest thatsharing debugging information using Swarm De-bugging can reduce the time spent on debuggingtasks

30Please give a shorter version with authorrunning and titlerunning prior to maketitle

64 Participantsrsquo Feedback

As with any visualisation technique proposed in the literature ours is a proofof concept with both intrinsic and accidental advantages and limitations In-trinsic advantages and limitations pertain to the visualisation itself and ourdesign choices while accidental advantages and limitations concern our im-plementation During our experiment we collected the participantsrsquo feedbackabout our visualisation and now discuss both its intrinsic and accidental ad-vantages and limitations as reported by them We go back to some of thelimitations in the next section that describes threats to the validity of ourexperiment We also report feedback from three of the participants

641 Intrinsic Advantage

Visualisation of Debugging Paths Participants commended our visualisationfor presenting useful information related to the classes and methods followedby other developers during debugging In particular one participant reportedthat ldquo[i]t seems a fairly simple way to visualize classes and to demonstratehow they interactrdquo which comforts us in our choice of both the visualisationtechnique (graphs) and the data to display (developersrsquo debugging paths)

Effort in Debugging Three participants also mentioned that our visualisationshows where developers spent their debugging effort and where there are un-derstanding ldquobottlenecksrdquo In particular one participant wrote that our vi-sualisation ldquoallows the developer to skip several steps in debugging knowingfrom the graph where the problem probably comes fromrdquo

642 Intrinsic Limitations

Location One participant commented that ldquothe location where [an] issue oc-curs is not the same as the one that is responsible for the issuerdquo We are wellaware of this difference between the location where a fault occurs for exam-ple a null-pointer exception and the location of the source of the fault forexample a constructor where the field is not initialisedrdquo

However we build our visualisation on the premise that developers canshare their debugging activities for that particular reason by sharing theycould readily identify the source of a fault rather than only the location whereit occurs We plan to perform further studies to assess the usefulness of ourvisualisation to validate (or not) our premise

Scalability Several participants commented on the possible lack of scalabilityof our visualisation Graphs are well known to be not scalable so we areexpecting issues with larger graphs [34] Strategies to mitigate these issuesinclude graph sampling and clustering We plan to add these features in thenext release of our technique

Swarm Debugging the Collective Intelligence on Interactive Debugging 31

Presentation Several participants also commented on the (relative) lack ofinformation brought by the visualisation which is complementary to the lim-itation in scalability

One participant commented on the difference between the graph showingthe developersrsquo paths and the relative importance of classes during executionFuture work should seek to combine both information on the same graph pos-sibly by combining size and colours size could relate to the developersrsquo pathswhile colours could indicate the ldquoimportancerdquo of a class during execution

Evolution One participant commented that the graph is relevant for one ver-sion of the system but that as soon as some changes are performed by adeveloper the paths (or parts thereof) may become irrelevant

We agree with the participant and accept this limitation because our vi-sualisation is currently implemented for one version We will explore in futurework how to handle evolution by changing the graph as new versions are cre-ated

Trap One participant warned that our visualisation could lead developers intoa ldquotraprdquo if all developers whose paths are displayed followed the ldquowrongrdquopaths We agree with the participant but accept this limitation because devel-opers can always choose appropriate paths

Understanding One participant reported that the visualisation alone does notbring enough information to understand the task at hand We accept thislimitation because our visualisation is built to be complementary to otherviews available in the IDE

643 Accidental Advantages

Reducing Code Complexity One participant discussed the use of our visuali-sation to reduce code complexity for the developers by highlighting its mainfunctionalities

Complementing Differential Views Another participant contrasted our visu-alisation with Git Diff and mentioned that they complement each other wellbecause our visualisation ldquo[a]llows to quickly see where the problem probablyhas been before it got fixedrdquo while Git Diff allows seeing where the problemwas fixed

Highlighting Refactoring Opportunities A third participant suggested that thelarger node could represent classes that could be refactored if they also havemany faults to simplify future debugging sessions for developers

32Please give a shorter version with authorrunning and titlerunning prior to maketitle

644 Accidental Limitations

Presentation Several participants commented on the presentation of the infor-mation by our visualisation Most importantly they remarked that identifyingthe location of the fault was difficult because there was no distinction betweenfaulty and non-faulty classes In the future we will assess the use of iconsandndashor colours to identify faulty classesmethods

Others commented on the lack of captions describing the various visualelements Although this information was present in the tutorial and question-naires we will add it also into the visualisation possibly using tooltips

One participant added that more information such as ldquoexecution timemetrics [by] invocationsrdquo and ldquofailuresuccess rate [by] invocationsrdquo could bevaluable We plan to perform other controlled experiments with such addi-tional information to assess its impact on developersrsquo performance

Finally one participant mentioned that arrows would sometimes overlapwhich points to the need for a better layout algorithm for the graph in ourvisualisation However finding a good graph layout is a well-known difficultproblem

Navigation One participant commented that the visualisation does not helpdevelopers navigating between classes whose methods have low cohesion Itshould be possible to show in different parts of the graph the methods andtheir classes independently to avoid large nodes We plan to modify the graphvisualisation to have a ldquomethod-levelrdquo view whose nodes could be methodsandndashor clusters of methods (independently of their classes)

645 General Feedback

Three participants left general feedback regarding their experience with ourvisualisation under the question ldquoDescribe your debugging experiencerdquo Allthree participants provided positive comments We report herein one of thethree comments

It went pretty well In the beginning I was at a loss so just was lookingaround for some time Then I opened the breakpoints view for anothertask that was related to file parsing in the hope to find some hints Andindeed Irsquove found the BibtexParser class where the method with themost number of breakpoints was the one where I later found the faultHowever only this knowledge was not enough so I had to study thecode a bit Luckily it didnrsquot require too much effort to spot the problembecause all the related code was concentrated inside the parser classLuckily I had a BibTeX database at hand to use it for debugging It wasexcellent

This comment highlights the advantages of our approach and suggests thatour premise may be correct and that developers may benefit from one anotherrsquos

Swarm Debugging the Collective Intelligence on Interactive Debugging 33

debugging sessions It encourages us to pursue our research work in this di-rection and perform more experiments to point further ways of improving ourapproach

7 Discussion

We now discuss some implications of our work for Software Engineering re-searchers developers debuggersrsquo developers and educators SDI (and GV) isopen and freely available on-line28 and researchers can use them to performnew empirical studies about debugging activities

Developers can use SDI to record their debugging patterns toidentify debugging strategies that are more efficient in the context of theirprojects to improve their debugging skills

Developers can share their debugging activities such as breakpointsandndashor stepping paths to improve collaborative work and ease debuggingWhile developers usually work on specific tasks there are sometimes re-openissues andndashor similar tasks that need to understand or toggle breakpoints onthe same entity Thus using breakpoints previously toggled by a developercould help to assist another developer working on a similar task For instancethe breakpoint search tools can be used to retrieve breakpoints from previousdebugging sessions which could help speed up a new one providing devel-opers with valid starting points Therefore the breakpoint searching tool candecrease the time spent to toggle a new breakpoint

Developers of debuggers can use SDI to understand developersrsquodebugging habits to create new tools ndash using novel data-mining techniques ndashto integrate different data sources SDI provides a transparent framework fordevelopers to share debugging information creating a collective intelligenceabout their projects

Educators can leverage SDI to teach interactive debugging tech-niques tracing their studentsrsquo debugging sessions and evaluating their per-formance Data collected by SDI from debugging sessions performed by pro-fessional developers could also be used to educate students eg by showingthem examples of good and bad debugging patterns

There are locations (line of code class or method) on which there wereset many breakpoints in different tasks by different developers and this is anopportunity to recommend those locations as candidates for new debuggingsessions However we could face the bootstrapping problem we cannot knowthat these locations are important until developers start to put breakpointson them This problem could be addressed with time by using the infrastruc-ture to collect and share breakpoints accumulating data that can be used forfuture debugging sessions Further such incremental usefulness can encouragemore developers to collect and share breakpoints possibly leading to better-automated recommendations

28 httpgithubcomswarmdebugging

34Please give a shorter version with authorrunning and titlerunning prior to maketitle

We have answered what debugging information is useful to share among de-velopers to ease debugging with evidence that sharing debugging breakpointsand sessions can ease developersrsquo debugging activities Our study provides use-ful insights to researchers and tool developers on how to provide appropriatesupport during debugging activities in general they could support develop-ers by sharing other developersrsquo breakpoints and sessions They could alsodevelop recommender systems to help developers in deciding where to setbreakpointsand use this evidence to build a grounded theory on the setting ofbreakpoints and stepping by developers to improve debuggers and other toolsupport

8 Threats to Validity

Despite its promising results there exist threats to the validity of our studythat we discuss in this section

As any other empirical study ours is subject to limitations that threatenthe validity of its results The first limitation is related to the number ofparticipants we had With 7 participants we can not claim generalization ofthe results However we accept this limitation because the goal of the studywas to show the effectiveness of the data collected by the SDI to obtain insightsabout developersrsquo debugging activities Future studies with a more significantnumber of participants and more systems and tasks are needed to confirm theresults of the present research

Other threats to the validity of our study concern their internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI to collect and share data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study by Sillito et al. [35].

Construct Validity Threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, levels of experience, etc. through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion Validity Threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal Validity Threats are related to the tools used to collect the data and the subject systems, and to whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale for large systems, but for JabRef, it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results for the same participant on the first and last tasks. Therefore, we accept this threat but still plan for future studies with more tasks on more systems. The participants were probably aware of the fact that all faults were already solved in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External Validity Threats are about the possibility to generalise our results. We used only one system in our Study 1 (JabRef) because we needed to have enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise works related to debugging to allow better positioning of our study among the published research.

Program Understanding Previous work studied program comprehension and provided tools to support program comprehension. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruption [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach is useful in editing mode (finding breakpoints or visualising paths) as well as during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in the debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the times spent by developers on debugging activities.

Automated debugging tools Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using the automated debugging tools.

Advanced Debugging Approaches Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., builds on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many debugging tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging Jiang et al. [33] studied the change impact analysis process that should be done during software maintenance by developers to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort for debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale, because different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements to support developers' hypotheses about fault locations without looking at the code previously, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault location when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of the developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or more adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plugins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software-engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development), and the CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB – The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD – Data Display Debugger (2010)
6. A. Ko, Proceeding of the 28th international conference on Software engineering – ICSE '06 p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 – Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering – FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Gueheneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Gueheneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications – OOPSLA '97 pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension – ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering – FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings. International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings. Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Gueheneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11 p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 – Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings – International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings – IEEE 8th International Conference on Global Software Engineering, ICGSE 2013 pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig. 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig. 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the event data collected when a developer performs some actions during a debugging session.
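To make the meta-model more concrete, the following sketch shows how two of these concepts might look as plain Java classes. The field names are our own illustrative assumptions, not the actual SDS schema.

class Breakpoint {
    long id;
    long sessionId;      // owning debugging session
    String typeName;     // fully qualified type where the breakpoint was toggled
    String methodName;   // enclosing method, if any
    int lineNumber;      // location inside the type
}

class Invocation {
    long id;
    long sessionId;
    String invokingMethod;  // caller visited by the developer
    String invokedMethod;   // callee reached during stepping
}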

The SDS provides several services for manipulating, querying, and searching collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
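As an illustration, the request above could be issued from a minimal Java 11+ client such as the sketch below; the endpoint is the one from the example, while the client class itself is hypothetical.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmApiExample {
    public static void main(String[] args) throws Exception {
        // Query the developer-search endpoint shown above
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // Prints the JSON list of matching developers
        System.out.println(response.body());
    }
}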

SQL Query Console The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.

Full-text Search Engine The SDS also provides ElasticSearch31, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig. 15 Swarm Debugging Dashboard

29 http://projects.spring.io/spring-boot
30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig. 16 Neo4J Browser – a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
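As an illustration, such a query could also be issued programmatically through the Neo4J Java driver, as sketched below; the connection URI, credentials, and the Method/INVOKES labels are assumptions made for the example, not necessarily the schema used by the SDS.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

public class InvocationGraphQuery {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                     AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            // Cypher: methods that invoke others but are never invoked themselves
            Result result = session.run(
                    "MATCH (m:Method) " +
                    "WHERE (m)-[:INVOKES]->() AND NOT ()-[:INVOKES]->(m) " +
                    "RETURN m.name AS name");
            while (result.hasNext()) {
                System.out.println(result.next().get("name").asString());
            }
        }
    }
}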

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
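The following sketch illustrates how a tracer can register for these two listener interfaces using the standard Eclipse debug APIs; the class body and the sendToSwarmServices() helper are hypothetical simplifications, not the actual SDT code.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class TracerSketch implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        // Register for debugger events and breakpoint changes
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);
        plugin.getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Stepping (Step Into/Over/Return) and breakpoint hits arrive as
            // SUSPEND events; the detail flag distinguishes their causes.
            if (event.getKind() == DebugEvent.SUSPEND) {
                sendToSwarmServices(event.getSource(), event.getDetail());
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        sendToSwarmServices(breakpoint, DebugEvent.BREAKPOINT);
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }

    private void sendToSwarmServices(Object source, int detail) {
        // Hypothetical helper: POST the event as JSON to the SDS RESTful API.
    }
}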

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the view "Swarm Manager" and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig. 17 The Swarm Tracer architecture [17]

Fig. 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig. 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace, without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking a node in the diagram.

Dynamic Method Call Graphs They are directed call graphs [31], as shown in Figure 21, used to display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated at the top of a tree, and adjacent nodes represent invocation sequences.


Fig. 20 Sequence stack diagram for the Bridge design pattern

Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
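A fuzzy search of this kind could translate into an ElasticSearch fuzzy query, as sketched below in Java 11+; the index name (breakpoints), field name (typeName), and host are assumptions for illustration, not the actual SDS configuration.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        // Fuzzy query: matches "factory" despite the misspelling "fcatory"
        String query = "{ \"query\": { \"fuzzy\": "
                + "{ \"typeName\": { \"value\": \"fcatory\" } } } }";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // JSON hits with matching breakpoints
    }
}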


Fig. 21 Method call graph for the Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define Starting/Ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes invoking methods and β is the subset of all vertexes invoked by methods, then the Starting and Ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
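Following the definitions above, the two sets can be computed directly from the collected invocation pairs; the sketch below is a minimal illustration with made-up method names.

import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class EntryExitPoints {
    public static void main(String[] args) {
        // Invocation pairs (invoking -> invoked) collected during a session
        List<Map.Entry<String, String>> invocations = List.of(
                Map.entry("main", "parse"),
                Map.entry("parse", "readField"));

        Set<String> invoking = new HashSet<>(); // alpha: vertexes that invoke
        Set<String> invoked = new HashSet<>();  // beta: vertexes that are invoked
        for (Map.Entry<String, String> e : invocations) {
            invoking.add(e.getKey());
            invoked.add(e.getValue());
        }

        Set<String> starting = new HashSet<>(invoking);
        starting.removeAll(invoked); // in alpha and not in beta -> {main}
        Set<String> ending = new HashSet<>(invoked);
        ending.removeAll(invoking);  // in beta and not in alpha -> {readField}

        System.out.println("Starting: " + starting + ", Ending: " + ending);
    }
}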

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using Eclipse Debug Core services. However, SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

Page 29: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

Swarm Debugging the Collective Intelligence on Interactive Debugging 29

Table 10 Results from control and experimental groups (average)

Task 0993

Metric Control [C] Experiment [E] ∆ [C-E] (s) [EC]

First breakpoint 000255 000340 -44 126

Time to start 000444 000518 -33 112

Elapsed time 003008 001605 843 53

Task 1026

Metric Control [C] Experiment [E] ∆ [C-E] (s) [EC]

First breakpoint 000242 000448 -126 177

Time to start 000402 000343 19 92

Elapsed time 002458 002041 257 83

63 Comparing Results from the Control and Experimental Groups

We compared the control and experimental groups using three metrics (1) thetime for setting the first breakpoint (2) the time to start a debugging sessionand (3) the elapsed time to finish the task We analysed recording sessionsof Tasks 0993 and 1026 compiling the average results from the two groups inTable 10

Observing the results in Table 10 we observed that the experimental groupspent more time to set the first breakpoint (26 more time for Task 0993 and77 more time for Task 1026) The times to start a debugging session are nearthe same (12 more time for Task 0993 and 18 less time for Task 1026) whencompared to the control group However participants who used our approachspent less time to finish both tasks (47 less time to Task 0993 and 17less time for Task 1026) This result suggests that participants invested moretime to toggle carefully the first breakpoint but consecutively completed thetasks faster than participants who toggled breakpoints quickly corroboratingour results in RQ2

Our results show that participants who used theshared debugging data invested more time to de-cide the first breakpoint but reduced their timeto finish the tasks These results suggest thatsharing debugging information using Swarm De-bugging can reduce the time spent on debuggingtasks

30Please give a shorter version with authorrunning and titlerunning prior to maketitle

64 Participantsrsquo Feedback

As with any visualisation technique proposed in the literature ours is a proofof concept with both intrinsic and accidental advantages and limitations In-trinsic advantages and limitations pertain to the visualisation itself and ourdesign choices while accidental advantages and limitations concern our im-plementation During our experiment we collected the participantsrsquo feedbackabout our visualisation and now discuss both its intrinsic and accidental ad-vantages and limitations as reported by them We go back to some of thelimitations in the next section that describes threats to the validity of ourexperiment We also report feedback from three of the participants

641 Intrinsic Advantage

Visualisation of Debugging Paths Participants commended our visualisationfor presenting useful information related to the classes and methods followedby other developers during debugging In particular one participant reportedthat ldquo[i]t seems a fairly simple way to visualize classes and to demonstratehow they interactrdquo which comforts us in our choice of both the visualisationtechnique (graphs) and the data to display (developersrsquo debugging paths)

Effort in Debugging Three participants also mentioned that our visualisationshows where developers spent their debugging effort and where there are un-derstanding ldquobottlenecksrdquo In particular one participant wrote that our vi-sualisation ldquoallows the developer to skip several steps in debugging knowingfrom the graph where the problem probably comes fromrdquo

642 Intrinsic Limitations

Location One participant commented that ldquothe location where [an] issue oc-curs is not the same as the one that is responsible for the issuerdquo We are wellaware of this difference between the location where a fault occurs for exam-ple a null-pointer exception and the location of the source of the fault forexample a constructor where the field is not initialisedrdquo

However we build our visualisation on the premise that developers canshare their debugging activities for that particular reason by sharing theycould readily identify the source of a fault rather than only the location whereit occurs We plan to perform further studies to assess the usefulness of ourvisualisation to validate (or not) our premise

Scalability Several participants commented on the possible lack of scalabilityof our visualisation Graphs are well known to be not scalable so we areexpecting issues with larger graphs [34] Strategies to mitigate these issuesinclude graph sampling and clustering We plan to add these features in thenext release of our technique

Swarm Debugging the Collective Intelligence on Interactive Debugging 31

Presentation Several participants also commented on the (relative) lack ofinformation brought by the visualisation which is complementary to the lim-itation in scalability

One participant commented on the difference between the graph showingthe developersrsquo paths and the relative importance of classes during executionFuture work should seek to combine both information on the same graph pos-sibly by combining size and colours size could relate to the developersrsquo pathswhile colours could indicate the ldquoimportancerdquo of a class during execution

Evolution One participant commented that the graph is relevant for one ver-sion of the system but that as soon as some changes are performed by adeveloper the paths (or parts thereof) may become irrelevant

We agree with the participant and accept this limitation because our vi-sualisation is currently implemented for one version We will explore in futurework how to handle evolution by changing the graph as new versions are cre-ated

Trap One participant warned that our visualisation could lead developers intoa ldquotraprdquo if all developers whose paths are displayed followed the ldquowrongrdquopaths We agree with the participant but accept this limitation because devel-opers can always choose appropriate paths

Understanding One participant reported that the visualisation alone does notbring enough information to understand the task at hand We accept thislimitation because our visualisation is built to be complementary to otherviews available in the IDE

643 Accidental Advantages

Reducing Code Complexity One participant discussed the use of our visuali-sation to reduce code complexity for the developers by highlighting its mainfunctionalities

Complementing Differential Views Another participant contrasted our visu-alisation with Git Diff and mentioned that they complement each other wellbecause our visualisation ldquo[a]llows to quickly see where the problem probablyhas been before it got fixedrdquo while Git Diff allows seeing where the problemwas fixed

Highlighting Refactoring Opportunities A third participant suggested that thelarger node could represent classes that could be refactored if they also havemany faults to simplify future debugging sessions for developers

32Please give a shorter version with authorrunning and titlerunning prior to maketitle

644 Accidental Limitations

Presentation Several participants commented on the presentation of the infor-mation by our visualisation Most importantly they remarked that identifyingthe location of the fault was difficult because there was no distinction betweenfaulty and non-faulty classes In the future we will assess the use of iconsandndashor colours to identify faulty classesmethods

Others commented on the lack of captions describing the various visualelements Although this information was present in the tutorial and question-naires we will add it also into the visualisation possibly using tooltips

One participant added that more information such as ldquoexecution timemetrics [by] invocationsrdquo and ldquofailuresuccess rate [by] invocationsrdquo could bevaluable We plan to perform other controlled experiments with such addi-tional information to assess its impact on developersrsquo performance

Finally one participant mentioned that arrows would sometimes overlapwhich points to the need for a better layout algorithm for the graph in ourvisualisation However finding a good graph layout is a well-known difficultproblem

Navigation One participant commented that the visualisation does not helpdevelopers navigating between classes whose methods have low cohesion Itshould be possible to show in different parts of the graph the methods andtheir classes independently to avoid large nodes We plan to modify the graphvisualisation to have a ldquomethod-levelrdquo view whose nodes could be methodsandndashor clusters of methods (independently of their classes)

645 General Feedback

Three participants left general feedback regarding their experience with ourvisualisation under the question ldquoDescribe your debugging experiencerdquo Allthree participants provided positive comments We report herein one of thethree comments

It went pretty well In the beginning I was at a loss so just was lookingaround for some time Then I opened the breakpoints view for anothertask that was related to file parsing in the hope to find some hints Andindeed Irsquove found the BibtexParser class where the method with themost number of breakpoints was the one where I later found the faultHowever only this knowledge was not enough so I had to study thecode a bit Luckily it didnrsquot require too much effort to spot the problembecause all the related code was concentrated inside the parser classLuckily I had a BibTeX database at hand to use it for debugging It wasexcellent

This comment highlights the advantages of our approach and suggests thatour premise may be correct and that developers may benefit from one anotherrsquos

Swarm Debugging the Collective Intelligence on Interactive Debugging 33

debugging sessions It encourages us to pursue our research work in this di-rection and perform more experiments to point further ways of improving ourapproach

7 Discussion

We now discuss some implications of our work for Software Engineering re-searchers developers debuggersrsquo developers and educators SDI (and GV) isopen and freely available on-line28 and researchers can use them to performnew empirical studies about debugging activities

Developers can use SDI to record their debugging patterns toidentify debugging strategies that are more efficient in the context of theirprojects to improve their debugging skills

Developers can share their debugging activities such as breakpointsandndashor stepping paths to improve collaborative work and ease debuggingWhile developers usually work on specific tasks there are sometimes re-openissues andndashor similar tasks that need to understand or toggle breakpoints onthe same entity Thus using breakpoints previously toggled by a developercould help to assist another developer working on a similar task For instancethe breakpoint search tools can be used to retrieve breakpoints from previousdebugging sessions which could help speed up a new one providing devel-opers with valid starting points Therefore the breakpoint searching tool candecrease the time spent to toggle a new breakpoint

Developers of debuggers can use SDI to understand developersrsquodebugging habits to create new tools ndash using novel data-mining techniques ndashto integrate different data sources SDI provides a transparent framework fordevelopers to share debugging information creating a collective intelligenceabout their projects

Educators can leverage SDI to teach interactive debugging tech-niques tracing their studentsrsquo debugging sessions and evaluating their per-formance Data collected by SDI from debugging sessions performed by pro-fessional developers could also be used to educate students eg by showingthem examples of good and bad debugging patterns

There are locations (line of code class or method) on which there wereset many breakpoints in different tasks by different developers and this is anopportunity to recommend those locations as candidates for new debuggingsessions However we could face the bootstrapping problem we cannot knowthat these locations are important until developers start to put breakpointson them This problem could be addressed with time by using the infrastruc-ture to collect and share breakpoints accumulating data that can be used forfuture debugging sessions Further such incremental usefulness can encouragemore developers to collect and share breakpoints possibly leading to better-automated recommendations

28 httpgithubcomswarmdebugging

34Please give a shorter version with authorrunning and titlerunning prior to maketitle

We have answered what debugging information is useful to share among de-velopers to ease debugging with evidence that sharing debugging breakpointsand sessions can ease developersrsquo debugging activities Our study provides use-ful insights to researchers and tool developers on how to provide appropriatesupport during debugging activities in general they could support develop-ers by sharing other developersrsquo breakpoints and sessions They could alsodevelop recommender systems to help developers in deciding where to setbreakpointsand use this evidence to build a grounded theory on the setting ofbreakpoints and stepping by developers to improve debuggers and other toolsupport

8 Threats to Validity

Despite its promising results there exist threats to the validity of our studythat we discuss in this section

As any other empirical study ours is subject to limitations that threatenthe validity of its results The first limitation is related to the number ofparticipants we had With 7 participants we can not claim generalization ofthe results However we accept this limitation because the goal of the studywas to show the effectiveness of the data collected by the SDI to obtain insightsabout developersrsquo debugging activities Future studies with a more significantnumber of participants and more systems and tasks are needed to confirm theresults of the present research

Other threats to the validity of our study concern their internal externaland conclusion validity We accept these threats because the experimentalstudy aimed to show the effectiveness of the SDI to collect and share dataabout developersrsquo interactive debugging activities Future work is needed toperform in-depth experimental studies with these research questions and oth-ers possibly drawn from the ones that developers asked in another study bySillito et al [35]

Construct Validity Threats are related to the metrics used to answerour research questions We mainly used breakpoint locations which is a precisemeasure Moreover as we located breakpoints using our Swarm DebuggingInfrastructure (SDI) and visualisation any issue with this measure would affectour results To mitigate these threats we collected both SDI data and videocaptures of the participantsrsquo screens and compared the information extractedfrom the videos with the data collected by the SDI We observed that thebreakpoints collected by the SDI are exactly those toggled by the participants

We ask participants to self-report on their efforts during the tasks levels ofexperience etc through questionnaires Consequently it is possible that theanswer does not represent their real efforts levels etc We accept this threatbecause questionnaires are the best means to collect data about participantswithout incurring a high cost Construct validity could be improved in futurework by using instruments to measure effort independently for example butthis would lead to more time- and effort-consuming experiments

Swarm Debugging the Collective Intelligence on Interactive Debugging 35

Conclusion Validity Threats concern the relations found between inde-pendent and dependent variables In particular they concern the assumptionsof the statistical tests performed on the data and how diverse is the data Wedid not perform any statistical analysis to answer our research questions soour results do not depend on any statistical assumption

Internal Validity Threats are related to the tools used to collect thedata and the subject systems and if the collected data is sufficient to answerthe research questions We collected data using our visualisation We are wellaware that our visualisation does not scale for large systems but for JabRef itallowed participants to share paths during debugging and researchers to collectrelevant data including shared paths We plan to revise our visualisation inthe near future to identify possibilities to improve it so that it scales up tolarge systems

Each participant performed more than one task on the same system It ispossible that a participant may have become familiar with the system after ex-ecuting a task and would be knowledgeable enough to toggle breakpoints whenperforming the subsequent ones However we did not observe any significantdifference in performance when comparing the results for the same participantfor the first and last task Therefore we accept this threat but still plan for fu-ture studies with more tasks on more systems The participants probably wereaware of the fact that all faults were already solved in Github We controlledthis issue using the video recordings observing that all participants did notlook at the commit history during the experiment

External Validity Threats are about the possibility to generalise ourresults We use only one system in our Study 1 (JabRef) because we neededto have enough data points from a single system to assess the effectivenessof breakpoint prediction We should collect more data on other systems andcheck whether the system used can affect our results

9 Related work

We now summarise works related to debugging to allow better positioning ofour study among the published research

Program Understanding Previous work studied program comprehension andprovided tools to support program comprehension Maalej et al [36] observedand surveyed developers during program comprehension activities They con-cluded that developers need runtime information and reported that developersfrequently execute programs using a debugger Ko et al [37] observed that de-velopers spend large amounts of times navigating between program elements

Feature and fault location approaches are used to identify and recommendprogram elements that are relevant to a task at hand [38] These approachesuse defect report [39] domain knowledge [40] version history and defect reportsimilarity [38] while others like Mylyn [41] use developersrsquo interaction traces

36Please give a shorter version with authorrunning and titlerunning prior to maketitle

which have been used to study work interruption [42] editing patterns [4344] program exploration patterns [45] or copypaste behaviour [46]

Despite sharing similarities (tracing developer events in an IDE) our ap-proach differs from Mylynrsquos [41] First Mylynrsquos approach does not collect oruse any dynamic debugging information it is not designed to explore the dy-namic behaviours of developers during debugging sessions Second it is usefulin editing mode because it just filters files in an Eclipse view following a previ-ous context Our approach is for editing mode (finding breakpoints or visualizepaths) as during interactive debugging sessions Consequently our work andMylynrsquos are complementary and they should be used together during devel-opment sessions

Debugging Tools for Program Understanding Romero et al [47] extended thework by Katz and Anderson [48] and identified high-level debugging strategieseg stepping and breaking execution paths and inspecting variable valuesThey reported that developers use the information available in the debuggersdifferently depending on their background and level of expertise

DebugAdvisor [49] is a recommender system to improve debugging produc-tivity by automating the search for similar issues from the past

Zayour [20] studied the difficulties faced by developers when debuggingin IDEs and reported that the features of the IDE affect the times spent bydevelopers on debugging activities

Automated debugging tools Automated debugging tools require both success-ful and failed runs and do not support programs with interactive inputs [6]Consequently they have not been widely adopted in practice Moreover auto-mated debugging approaches are often unable to indicate the ldquotruerdquo locationsof faults [7] Other more interactive methods such as slicing and query lan-guages help developers but to date there has been no evidence that theysignificantly ease developersrsquo debugging activities

Recent studies showed that empirical evidence of the usefulness of manyautomated debugging techniques is limited [50] Researchers also found thatautomated debugging tools are rarely used in practice [50] At least in somescenarios the time to collect coverage information manually label the testcases as failing or passing and run the calculations may exceed the actualtime saved by using the automated debugging tools

Advanced Debugging Approaches Zheng et al [51] presented a systematic ap-proach to the statistical debugging of programs in the presence of multiplefaults using probability inference and common voting framework to accom-modate more general faults and predicate settings Ko and Myers [652] intro-duced interrogative debugging a process with which developers ask questionsabout their programs outputs to determine what parts of the programs tounderstand

Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that should be done during software maintenance by developers to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis, before they made changes, by using IDE navigational functionalities. They also did dynamic change impact analysis, after they made changes, by running the programs. In their study, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human effort during debugging.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos in 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor for the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale: different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is more incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation provides elements that support developers' hypotheses about fault locations without requiring them to look at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another one of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceeding of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Roßler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97 pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11 p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013 pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig. 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. They include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (one or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developers, projects, and debugging events.


Fig. 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types whose source code is available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
– Event is the data collected when a developer performs some action during a debugging session.

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built with the Spring Boot framework (http://projects.spring.io/spring-boot). Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
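
Because the API is plain HTTP, any client can consume it. As a minimal illustration (assuming only that an SDS instance is reachable at swarmdebugging.org, the host shown above), the following Java sketch submits that query and prints the JSON response:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SdsClientExample {
    public static void main(String[] args) throws Exception {
        // Query the SDS RESTful API for developers named "petrillo".
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .GET()
                .build();
        // The response body is the JSON list of matching developers.
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}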

SQL Query Console. The SDS provides a console (http://db.swarmdebugging.org) that receives SQL queries on the debugging data, providing relational aggregations and functions.
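
For instance, a researcher could rank the types on which most breakpoints were toggled with a query such as the following sketch; the table and column names are illustrative assumptions, not necessarily the exact SDS relational schema:

SELECT t.full_name, COUNT(b.id) AS breakpoint_count
FROM breakpoint b
JOIN type t ON b.type_id = t.id
GROUP BY t.full_name
ORDER BY breakpoint_count DESC
LIMIT 10;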

Full-text Search Engine. The SDS also provides ElasticSearch (https://www.elastic.co), a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig. 15 Swarm Debugging Dashboard

Graph Querying Console. The SDS also persists debugging data in a Neo4J (http://neo4j.com) graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.

Fig. 16 Neo4J Browser - a Cypher query example
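
A Cypher query over the invocation data might look like the following sketch; the node label (Method) and relationship type (INVOKES) are illustrative assumptions about the graph schema, not necessarily the exact one used by the SDS:

MATCH (caller:Method)-[:INVOKES]->(callee:Method)
WHERE callee.name = 'parse'
RETURN caller.name, callee.name;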

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
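
The following abridged Java sketch shows how such listeners can be registered with the Eclipse debug framework; it is a simplified reconstruction for illustration, not the actual SDT source code:

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        // Subscribe to debugger events (suspend, resume, step, ...) and breakpoint changes.
        DebugPlugin.getDefault().addDebugEventListener(this);
        DebugPlugin.getDefault().getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // A SUSPEND event with a STEP_END detail marks the end of a step action.
            if (event.getKind() == DebugEvent.SUSPEND
                    && event.getDetail() == DebugEvent.STEP_END) {
                // Here the SDT would inspect the stack trace and send an
                // invocation entry to the Swarm Debugging Services.
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // Here the SDT would send the breakpoint's type and location to the SDS.
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}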

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.



Fig. 17 The Swarm Tracer architecture [17]

Fig. 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about the called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices, which send RESTful messages to a SwarmRestFacade connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] that represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace, without returns. The first node is a starting method (a non-invoked method) and the last node is an ending method (a non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated at the top of a tree, and adjacent nodes represent invocation sequences.


Fig. 20 Sequence stack diagram for the Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.

Fig. 19 Breakpoint search tool (fuzzy search example)
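
Under the hood, such a fuzzy search corresponds to an ElasticSearch fuzzy query. The sketch below is illustrative only; the index and field names (breakpoints, typeName) are assumptions rather than the exact SDS mapping:

GET /breakpoints/_search
{
  "query": {
    "fuzzy": {
      "typeName": { "value": "fcatory" }
    }
  }
}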


Fig. 21 Method call graph for the Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes invoking methods and β is the subset of all vertexes invoked by methods, then the starting and ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
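
In code, these definitions amount to two set differences over the recorded invocation edges, as in this small, self-contained Java sketch (the method names are illustrative):

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GraphEndpoints {
    // Each edge is a pair (invoking method, invoked method).
    record Edge(String caller, String callee) { }

    public static void main(String[] args) {
        List<Edge> edges = List.of(
                new Edge("main", "parse"),
                new Edge("parse", "readField"),
                new Edge("parse", "checkEntry"));

        Set<String> callers = new HashSet<>(); // alpha: vertexes that invoke
        Set<String> callees = new HashSet<>(); // beta: vertexes that are invoked
        for (Edge e : edges) {
            callers.add(e.caller());
            callees.add(e.callee());
        }

        Set<String> startingPoints = new HashSet<>(callers);
        startingPoints.removeAll(callees); // in alpha and not in beta
        Set<String> endingPoints = new HashSet<>(callees);
        endingPoints.removeAll(callers);   // in beta and not in alpha

        System.out.println("Starting points: " + startingPoints); // [main]
        System.out.println("Ending points: " + endingPoints);     // [readField, checkEntry]
    }
}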

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


64 Participantsrsquo Feedback

As with any visualisation technique proposed in the literature ours is a proofof concept with both intrinsic and accidental advantages and limitations In-trinsic advantages and limitations pertain to the visualisation itself and ourdesign choices while accidental advantages and limitations concern our im-plementation During our experiment we collected the participantsrsquo feedbackabout our visualisation and now discuss both its intrinsic and accidental ad-vantages and limitations as reported by them We go back to some of thelimitations in the next section that describes threats to the validity of ourexperiment We also report feedback from three of the participants

641 Intrinsic Advantage

Visualisation of Debugging Paths Participants commended our visualisationfor presenting useful information related to the classes and methods followedby other developers during debugging In particular one participant reportedthat ldquo[i]t seems a fairly simple way to visualize classes and to demonstratehow they interactrdquo which comforts us in our choice of both the visualisationtechnique (graphs) and the data to display (developersrsquo debugging paths)

Effort in Debugging Three participants also mentioned that our visualisationshows where developers spent their debugging effort and where there are un-derstanding ldquobottlenecksrdquo In particular one participant wrote that our vi-sualisation ldquoallows the developer to skip several steps in debugging knowingfrom the graph where the problem probably comes fromrdquo

642 Intrinsic Limitations

Location One participant commented that ldquothe location where [an] issue oc-curs is not the same as the one that is responsible for the issuerdquo We are wellaware of this difference between the location where a fault occurs for exam-ple a null-pointer exception and the location of the source of the fault forexample a constructor where the field is not initialisedrdquo

However we build our visualisation on the premise that developers canshare their debugging activities for that particular reason by sharing theycould readily identify the source of a fault rather than only the location whereit occurs We plan to perform further studies to assess the usefulness of ourvisualisation to validate (or not) our premise

Scalability Several participants commented on the possible lack of scalabilityof our visualisation Graphs are well known to be not scalable so we areexpecting issues with larger graphs [34] Strategies to mitigate these issuesinclude graph sampling and clustering We plan to add these features in thenext release of our technique

Swarm Debugging the Collective Intelligence on Interactive Debugging 31

Presentation Several participants also commented on the (relative) lack ofinformation brought by the visualisation which is complementary to the lim-itation in scalability

One participant commented on the difference between the graph showingthe developersrsquo paths and the relative importance of classes during executionFuture work should seek to combine both information on the same graph pos-sibly by combining size and colours size could relate to the developersrsquo pathswhile colours could indicate the ldquoimportancerdquo of a class during execution

Evolution One participant commented that the graph is relevant for one ver-sion of the system but that as soon as some changes are performed by adeveloper the paths (or parts thereof) may become irrelevant

We agree with the participant and accept this limitation because our vi-sualisation is currently implemented for one version We will explore in futurework how to handle evolution by changing the graph as new versions are cre-ated

Trap One participant warned that our visualisation could lead developers intoa ldquotraprdquo if all developers whose paths are displayed followed the ldquowrongrdquopaths We agree with the participant but accept this limitation because devel-opers can always choose appropriate paths

Understanding One participant reported that the visualisation alone does notbring enough information to understand the task at hand We accept thislimitation because our visualisation is built to be complementary to otherviews available in the IDE

643 Accidental Advantages

Reducing Code Complexity One participant discussed the use of our visuali-sation to reduce code complexity for the developers by highlighting its mainfunctionalities

Complementing Differential Views Another participant contrasted our visu-alisation with Git Diff and mentioned that they complement each other wellbecause our visualisation ldquo[a]llows to quickly see where the problem probablyhas been before it got fixedrdquo while Git Diff allows seeing where the problemwas fixed

Highlighting Refactoring Opportunities A third participant suggested that thelarger node could represent classes that could be refactored if they also havemany faults to simplify future debugging sessions for developers

32Please give a shorter version with authorrunning and titlerunning prior to maketitle

644 Accidental Limitations

Presentation Several participants commented on the presentation of the infor-mation by our visualisation Most importantly they remarked that identifyingthe location of the fault was difficult because there was no distinction betweenfaulty and non-faulty classes In the future we will assess the use of iconsandndashor colours to identify faulty classesmethods

Others commented on the lack of captions describing the various visualelements Although this information was present in the tutorial and question-naires we will add it also into the visualisation possibly using tooltips

One participant added that more information such as ldquoexecution timemetrics [by] invocationsrdquo and ldquofailuresuccess rate [by] invocationsrdquo could bevaluable We plan to perform other controlled experiments with such addi-tional information to assess its impact on developersrsquo performance

Finally one participant mentioned that arrows would sometimes overlapwhich points to the need for a better layout algorithm for the graph in ourvisualisation However finding a good graph layout is a well-known difficultproblem

Navigation One participant commented that the visualisation does not helpdevelopers navigating between classes whose methods have low cohesion Itshould be possible to show in different parts of the graph the methods andtheir classes independently to avoid large nodes We plan to modify the graphvisualisation to have a ldquomethod-levelrdquo view whose nodes could be methodsandndashor clusters of methods (independently of their classes)

645 General Feedback

Three participants left general feedback regarding their experience with ourvisualisation under the question ldquoDescribe your debugging experiencerdquo Allthree participants provided positive comments We report herein one of thethree comments

It went pretty well In the beginning I was at a loss so just was lookingaround for some time Then I opened the breakpoints view for anothertask that was related to file parsing in the hope to find some hints Andindeed Irsquove found the BibtexParser class where the method with themost number of breakpoints was the one where I later found the faultHowever only this knowledge was not enough so I had to study thecode a bit Luckily it didnrsquot require too much effort to spot the problembecause all the related code was concentrated inside the parser classLuckily I had a BibTeX database at hand to use it for debugging It wasexcellent

This comment highlights the advantages of our approach and suggests thatour premise may be correct and that developers may benefit from one anotherrsquos

Swarm Debugging the Collective Intelligence on Interactive Debugging 33

debugging sessions It encourages us to pursue our research work in this di-rection and perform more experiments to point further ways of improving ourapproach

7 Discussion

We now discuss some implications of our work for Software Engineering re-searchers developers debuggersrsquo developers and educators SDI (and GV) isopen and freely available on-line28 and researchers can use them to performnew empirical studies about debugging activities

Developers can use SDI to record their debugging patterns toidentify debugging strategies that are more efficient in the context of theirprojects to improve their debugging skills

Developers can share their debugging activities such as breakpointsandndashor stepping paths to improve collaborative work and ease debuggingWhile developers usually work on specific tasks there are sometimes re-openissues andndashor similar tasks that need to understand or toggle breakpoints onthe same entity Thus using breakpoints previously toggled by a developercould help to assist another developer working on a similar task For instancethe breakpoint search tools can be used to retrieve breakpoints from previousdebugging sessions which could help speed up a new one providing devel-opers with valid starting points Therefore the breakpoint searching tool candecrease the time spent to toggle a new breakpoint

Developers of debuggers can use SDI to understand developersrsquodebugging habits to create new tools ndash using novel data-mining techniques ndashto integrate different data sources SDI provides a transparent framework fordevelopers to share debugging information creating a collective intelligenceabout their projects

Educators can leverage SDI to teach interactive debugging tech-niques tracing their studentsrsquo debugging sessions and evaluating their per-formance Data collected by SDI from debugging sessions performed by pro-fessional developers could also be used to educate students eg by showingthem examples of good and bad debugging patterns

There are locations (line of code class or method) on which there wereset many breakpoints in different tasks by different developers and this is anopportunity to recommend those locations as candidates for new debuggingsessions However we could face the bootstrapping problem we cannot knowthat these locations are important until developers start to put breakpointson them This problem could be addressed with time by using the infrastruc-ture to collect and share breakpoints accumulating data that can be used forfuture debugging sessions Further such incremental usefulness can encouragemore developers to collect and share breakpoints possibly leading to better-automated recommendations

28 httpgithubcomswarmdebugging

34Please give a shorter version with authorrunning and titlerunning prior to maketitle

We have answered what debugging information is useful to share among de-velopers to ease debugging with evidence that sharing debugging breakpointsand sessions can ease developersrsquo debugging activities Our study provides use-ful insights to researchers and tool developers on how to provide appropriatesupport during debugging activities in general they could support develop-ers by sharing other developersrsquo breakpoints and sessions They could alsodevelop recommender systems to help developers in deciding where to setbreakpointsand use this evidence to build a grounded theory on the setting ofbreakpoints and stepping by developers to improve debuggers and other toolsupport

8 Threats to Validity

Despite its promising results there exist threats to the validity of our studythat we discuss in this section

As any other empirical study ours is subject to limitations that threatenthe validity of its results The first limitation is related to the number ofparticipants we had With 7 participants we can not claim generalization ofthe results However we accept this limitation because the goal of the studywas to show the effectiveness of the data collected by the SDI to obtain insightsabout developersrsquo debugging activities Future studies with a more significantnumber of participants and more systems and tasks are needed to confirm theresults of the present research

Other threats to the validity of our study concern their internal externaland conclusion validity We accept these threats because the experimentalstudy aimed to show the effectiveness of the SDI to collect and share dataabout developersrsquo interactive debugging activities Future work is needed toperform in-depth experimental studies with these research questions and oth-ers possibly drawn from the ones that developers asked in another study bySillito et al [35]

Construct Validity Threats are related to the metrics used to answerour research questions We mainly used breakpoint locations which is a precisemeasure Moreover as we located breakpoints using our Swarm DebuggingInfrastructure (SDI) and visualisation any issue with this measure would affectour results To mitigate these threats we collected both SDI data and videocaptures of the participantsrsquo screens and compared the information extractedfrom the videos with the data collected by the SDI We observed that thebreakpoints collected by the SDI are exactly those toggled by the participants

We ask participants to self-report on their efforts during the tasks levels ofexperience etc through questionnaires Consequently it is possible that theanswer does not represent their real efforts levels etc We accept this threatbecause questionnaires are the best means to collect data about participantswithout incurring a high cost Construct validity could be improved in futurework by using instruments to measure effort independently for example butthis would lead to more time- and effort-consuming experiments

Swarm Debugging the Collective Intelligence on Interactive Debugging 35

Conclusion Validity Threats concern the relations found between inde-pendent and dependent variables In particular they concern the assumptionsof the statistical tests performed on the data and how diverse is the data Wedid not perform any statistical analysis to answer our research questions soour results do not depend on any statistical assumption

Internal Validity Threats are related to the tools used to collect thedata and the subject systems and if the collected data is sufficient to answerthe research questions We collected data using our visualisation We are wellaware that our visualisation does not scale for large systems but for JabRef itallowed participants to share paths during debugging and researchers to collectrelevant data including shared paths We plan to revise our visualisation inthe near future to identify possibilities to improve it so that it scales up tolarge systems

Each participant performed more than one task on the same system It ispossible that a participant may have become familiar with the system after ex-ecuting a task and would be knowledgeable enough to toggle breakpoints whenperforming the subsequent ones However we did not observe any significantdifference in performance when comparing the results for the same participantfor the first and last task Therefore we accept this threat but still plan for fu-ture studies with more tasks on more systems The participants probably wereaware of the fact that all faults were already solved in Github We controlledthis issue using the video recordings observing that all participants did notlook at the commit history during the experiment

External Validity Threats are about the possibility to generalise ourresults We use only one system in our Study 1 (JabRef) because we neededto have enough data points from a single system to assess the effectivenessof breakpoint prediction We should collect more data on other systems andcheck whether the system used can affect our results

9 Related work

We now summarise works related to debugging to allow better positioning ofour study among the published research

Program Understanding Previous work studied program comprehension andprovided tools to support program comprehension Maalej et al [36] observedand surveyed developers during program comprehension activities They con-cluded that developers need runtime information and reported that developersfrequently execute programs using a debugger Ko et al [37] observed that de-velopers spend large amounts of times navigating between program elements

Feature and fault location approaches are used to identify and recommendprogram elements that are relevant to a task at hand [38] These approachesuse defect report [39] domain knowledge [40] version history and defect reportsimilarity [38] while others like Mylyn [41] use developersrsquo interaction traces

36Please give a shorter version with authorrunning and titlerunning prior to maketitle

which have been used to study work interruption [42] editing patterns [4344] program exploration patterns [45] or copypaste behaviour [46]

Despite sharing similarities (tracing developer events in an IDE) our ap-proach differs from Mylynrsquos [41] First Mylynrsquos approach does not collect oruse any dynamic debugging information it is not designed to explore the dy-namic behaviours of developers during debugging sessions Second it is usefulin editing mode because it just filters files in an Eclipse view following a previ-ous context Our approach is for editing mode (finding breakpoints or visualizepaths) as during interactive debugging sessions Consequently our work andMylynrsquos are complementary and they should be used together during devel-opment sessions

Debugging Tools for Program Understanding Romero et al [47] extended thework by Katz and Anderson [48] and identified high-level debugging strategieseg stepping and breaking execution paths and inspecting variable valuesThey reported that developers use the information available in the debuggersdifferently depending on their background and level of expertise

DebugAdvisor [49] is a recommender system to improve debugging produc-tivity by automating the search for similar issues from the past

Zayour [20] studied the difficulties faced by developers when debuggingin IDEs and reported that the features of the IDE affect the times spent bydevelopers on debugging activities

Automated debugging tools Automated debugging tools require both success-ful and failed runs and do not support programs with interactive inputs [6]Consequently they have not been widely adopted in practice Moreover auto-mated debugging approaches are often unable to indicate the ldquotruerdquo locationsof faults [7] Other more interactive methods such as slicing and query lan-guages help developers but to date there has been no evidence that theysignificantly ease developersrsquo debugging activities

Recent studies showed that empirical evidence of the usefulness of manyautomated debugging techniques is limited [50] Researchers also found thatautomated debugging tools are rarely used in practice [50] At least in somescenarios the time to collect coverage information manually label the testcases as failing or passing and run the calculations may exceed the actualtime saved by using the automated debugging tools

Advanced Debugging Approaches Zheng et al [51] presented a systematic ap-proach to the statistical debugging of programs in the presence of multiplefaults using probability inference and common voting framework to accom-modate more general faults and predicate settings Ko and Myers [652] intro-duced interrogative debugging a process with which developers ask questionsabout their programs outputs to determine what parts of the programs tounderstand

Swarm Debugging the Collective Intelligence on Interactive Debugging 37

Pothier and Tanter [29] proposed Omniscient debuggers an approach tosupport back-in-time navigation across previous program states Delta debug-ging [53] by Hofer et al means that the smaller the failure-inducing inputthe less program code is covered It can be used to minimise a failure-inducinginput systematically Ressia [54] proposed object-centric debugging focusingon objects as the key abstraction execution for many tasks

Estler et al [55] discussed collaborative debugging suggesting that collab-oration in debugging activities is perceived as important by developers andcan improve their experience Our approach is consistent with this findingalthough we use asynchronous debugging sessions

Empirical Studies on Debugging Jiang et al [33] studied the change impactanalysis process that should be done during software maintenance by devel-opers to make sure changes do not introduce new faults They conducted twostudies about change impact analysis during debugging sessions They foundthat the programmers in their studies did static change impact analysis beforethey made changes by using IDE navigational functionalities They also diddynamic change impact analysis after they made changes by running the pro-grams In their study programmers did not use any change impact analysistools

Zhang et al [14] proposed a method to generate breakpoints based onexisting fault localization techniques showing that the generated breakpointscan usually save some human effort for debugging

10 Conclusion

Debugging is an important and challenging task in software maintenance re-quiring dedication and expertise However despite its importance developersrsquodebugging behaviors have not been extensively and comprehensively studiedIn this paper we introduced the concept of Swarm Debugging based on thefact that developers performing different debugging sessions build collectiveknowledge We asked what debugging information is useful to share amongdevelopers to ease debugging We particularly studied two pieces of debugginginformation breakpoints (and their locations) and sessions (debugging paths)because these pieces of information are related to the two main activities dur-ing debugging setting breakpoints and stepping inoverout statements

To evaluate the usefulness of Swarm Debugging and the sharing of de-bugging data we conducted two observational studies In the first study tounderstand how developers set breakpoints we collected and analyzed morethan 10 hours of developersrsquo videos in 45 debugging sessions performed by 28different independent developers containing 307 breakpoints on three soft-ware systems

The first study allowed us to draw four main conclusions At first settingthe first breakpoint is not an easy task and developers need tools to locatethe places where to toggle breakpoints Secondly the time of setting the first

38Please give a shorter version with authorrunning and titlerunning prior to maketitle

breakpoint is a predictor for the duration of a debugging task independentlyof the task Third developers choose breakpoints purposefully with an under-lying rationale because different developers set breakpoints on the same lineof code for the same task and also different developers toggle breakpointson the same classes or methods for different tasks showing the existence ofimportant ldquodebugging hot-spotsrdquo (ie regions in the code where there is moreincidence of debugging events) andndashor more error-prone classes and methodsFinally and surprisingly different independent developers set breakpoints atthe same locations for similar debugging tasks and thus collecting and sharingbreakpoints could assist developers during debugging task

Further we conducted a qualitative study with 23 professional developersand a controlled experiment with 13 professional developers collecting morethan 3 hours of developersrsquo debugging sessions From this second study weconcluded that (1) combining stepping paths in a graph visualisation from sev-eral debugging sessions produced elements to support developersrsquo hypothesesabout fault locations without looking at the code previously and (2) sharingprevious debugging sessions support debugging hypothesis and consequentlyreducing the effort on searching of code

Our results provide evidence that previous debugging sessions provide in-sights to and can be starting points for developers when building debugginghypotheses They showed that developers construct correct hypotheses on faultlocation when looking at graphs built from previous debugging sessions More-over they showed that developers can use past debugging sessions to identifystarting points for new debugging sessions Furthermore faults are recurrentand may be reopened sometime months later Sharing debugging sessions (asMylyn for editing sessions) is an approach to support debugging hypothesesand to support the reconstruction of the complex mental model processes in-volved in debugging However research work is in progress to corroborate theseresults

In future work we plan to build grounded theories on the use of breakpointsby developers We will use these theories to recommend breakpoints to otherdevelopers Developers need tools to locate adequate places to set breakpointsin their source code Our results suggest the opportunity for a breakpointrecommendation system similar to previous work [14] They could also formthe basis for building a grounded theory of the developersrsquo use of breakpointsto improve debuggers and other tool support

Moreover we also suggest that debugging tasks could be divided intotwo activities one of locating bugs which could benefit from the col-lective intelligence of other developers and could be performed by dedicatedldquohuntersrdquo and another one of fixing the faults which requires deep un-derstanding of the program its design its architecture and the consequencesof changes This latter activity could be performed by dedicated ldquobuildersrdquoHence actionable results include recommender systems and a change of paradigmin the debugging of software programs

Last but not least the research community can leverage the SDI to con-duct more studies to improve our understanding of developersrsquo debugging be-

Swarm Debugging the Collective Intelligence on Interactive Debugging 39

haviour which could ultimately result into the development of whole newfamilies of debugging tools that are more efficient andndashor more adapted tothe particularity of debugging Many open questions remain and this paper isjust a first step towards fully understanding how collective intelligence couldimprove debugging activities

Our vision is that IDEs should incorporate a general framework to captureand exploit IDE interactions creating an ecosystem of context-aware appli-cations and plugins Swarm Debugging is the first step towards intelligentdebuggers and IDEs context-aware programs that monitor and reason abouthow developers interact with them providing for crowd software-engineering

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engi-neering Research Council of Canada (NSERC) the Brazilian research fundingagencies CNPq (National Council for Scientific and Technological Develop-ment) and CAPES Foundation (Finance Code 001) We also acknowledgeall the participants in our experiments and the insightful comments from theanonymous reviewers

References

1. A.S. Tanenbaum, W.H. Benson. Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso. In Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton. In Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs. Debugging with GDB: The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright. GNU DDD: Data Display Debugger (2010)
6. A. Ko. In Proceedings of the 28th International Conference on Software Engineering (ICSE '06), p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler. In 2012 1st International Workshop on User Evaluation for Software Engineering Researchers (USER 2012), pp. 13–16 (2012). DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers. In 2010 ACM/IEEE 32nd International Conference on Software Engineering, vol. 1, p. 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers. In Proceedings of the 27th International Conference on Software Engineering (ICSE 2005), pp. 126–135 (2005). DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine. In ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky. In Proceedings of the Future of Software Engineering (FOSE 2014) (ACM Press, New York, NY, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman. How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen. Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm. Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas. In 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc. In Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc. In 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng. IEEE Software 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar. Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder. In Proceedings of the 6th International Conference on Aspect-Oriented Software Development (AOSD '07) (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz. Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel. Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6
25. T. Ball, S. Eick. Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn. Agile Software Development: The Cooperative Game, 2nd edn. (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming. IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath. In 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter. IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman. In 40th International Conference on Software Engineering (ICSE) (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers. In Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA '97), pp. 108–124 (1997). DOI 10.1145/263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker. Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212
33. S. Jiang, C. McMillan, R. Santelices. Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau. In 2015 International Conference on Big Data and Smart Computing (BigComp) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K. De Volder. IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke. ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung. IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo. In Proceedings of the 22nd International Conference on Program Comprehension (ICPC 2014) (ACM Press, New York, NY, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo. In 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE 2014) (ACM Press, New York, NY, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy. In Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez. In 2015 IEEE 22nd International Conference on Software Analysis, Evolution and Reengineering (SANER) (2015), pp. 251–260
43. A. Ying, M. Robillard. In Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan. In Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams. In 2013 20th Working Conference on Reverse Engineering (WCRE) (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan. In 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories (MSR) (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant. International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson. Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala. In Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE) (ACM Press, New York, NY, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso. In Proceedings of the 2011 International Symposium on Software Testing and Analysis (ISSTA '11), p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken. Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers. In CHI 2009: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, NY, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa. Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz. In Proceedings of the International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer. In Proceedings of the IEEE 8th International Conference on Global Software Engineering (ICGSE 2013), pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza. In Proceedings of the Sixth Working Conference on Reverse Engineering (WCRE '99) (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '13) (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan. ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56] and include the following (a minimal code sketch follows the list):

Fig 14 The Swarm Debugging metadata [17]

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.
– Type represents classes and interfaces in the project. Each type has a source code and a file. SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the event data collected when a developer performs some actions during a debugging session.
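To make the meta-model concrete, here is a minimal sketch of how a few of these concepts could be rendered as Java entity classes. The paper mentions Spring Boot and PostgreSQL, so a JPA-style mapping is plausible, but the actual SDS schema, field names, and mappings are assumptions for illustration only.

import javax.persistence.*;
import java.util.List;

@Entity
class Developer {
    @Id @GeneratedValue Long id;
    String name;                     // queried by the RESTful search API
    @OneToMany(mappedBy = "developer") List<Session> sessions;
}

@Entity
class Task { @Id @GeneratedValue Long id; String title; }

@Entity
class Type { @Id @GeneratedValue Long id; String fullName; }

@Entity
class Method {
    @Id @GeneratedValue Long id;
    String signature;
    @ManyToOne Type type;            // the type declaring this method
}

@Entity
class Session {
    @Id @GeneratedValue Long id;
    @ManyToOne Developer developer;  // who debugs
    @ManyToOne Task task;            // the task being executed
    @OneToMany(mappedBy = "session") List<Breakpoint> breakpoints;
}

@Entity
class Breakpoint {
    @Id @GeneratedValue Long id;
    @ManyToOne Session session;
    @ManyToOne Type type;            // where the breakpoint was toggled
    @ManyToOne Method method;        // enclosing method, if appropriate
    int lineNumber;
}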

The SDS provides several services for manipulating, querying, and searching collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API The SDS provides a RESTful API to manipulate debugging data using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with the list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
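The endpoint above follows the Spring Data REST URL convention (/developers/search/findByName), so a repository of roughly the following shape would expose it. This is a hedged reconstruction of how such an endpoint is typically declared, not the actual SDS source:

import java.util.List;
import org.springframework.data.repository.CrudRepository;
import org.springframework.data.repository.query.Param;
import org.springframework.data.rest.core.annotation.RepositoryRestResource;

@RepositoryRestResource(collectionResourceRel = "developers", path = "developers")
interface DeveloperRepository extends CrudRepository<Developer, Long> {
    // Automatically exported by Spring Data REST at
    // /developers/search/findByName?name=...
    List<Developer> findByName(@Param("name") String name);
}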

SQL Query Console The SDS provides a console30 for running SQL queries on the debugging data, providing relational aggregations and functions.
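As an example of the relational aggregations such a console supports, the query below counts breakpoints per type, a quick way to spot debugging hot-spots; the same query can be issued programmatically over JDBC. Table and column names are assumptions for illustration:

import java.sql.*;

public class BreakpointStats {
    public static void main(String[] args) throws SQLException {
        try (Connection con = DriverManager.getConnection(
                 "jdbc:postgresql://localhost:5432/swarm", "user", "password");
             Statement st = con.createStatement();
             // Most-debugged types first
             ResultSet rs = st.executeQuery(
                 "SELECT type_id, COUNT(*) AS n FROM breakpoint "
                 + "GROUP BY type_id ORDER BY n DESC")) {
            while (rs.next()) {
                System.out.println(rs.getLong("type_id") + ": " + rs.getInt("n"));
            }
        }
    }
}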

Full-text Search Engine The SDS also provides ElasticSearch31, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
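For illustration, a query in the spirit of Figure 16 can also be issued from Java through the Neo4J driver. The node label Method, the relationship type INVOKES, and the property names below are assumptions about the SDS graph model, not its confirmed schema:

import org.neo4j.driver.v1.*;

public class InvocationQuery {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                 AuthTokens.basic("neo4j", "password"));
             Session session = driver.session()) {
            // All methods transitively reached from a given starting method
            StatementResult result = session.run(
                "MATCH (m:Method {name: $name})-[:INVOKES*]->(n:Method) "
                + "RETURN DISTINCT n.name",
                Values.parameters("name", "main"));
            while (result.hasNext()) {
                System.out.println(result.next().get("n.name").asString());
            }
        }
    }
}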

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view and toggle breakpoints, triggering stepping events such as Step Into, Step Over, or Step Return. These events are caught, and the stack trace items are analyzed by the Tracer, which extracts method invocations.
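The sketch below illustrates this listener mechanism with the standard Eclipse debug API; it is a simplified reconstruction of the DebugTracer, not the actual SDT source:

import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IDebugEventSetListener;

public class DebugTracerSketch implements IDebugEventSetListener {

    public void start() {
        // Register with the debug plug-in to receive all debug events
        DebugPlugin.getDefault().addDebugEventListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // A SUSPEND event with detail STEP_END or BREAKPOINT marks the spots
            // where the tracer can inspect the stack trace and record invocations
            if (event.getKind() == DebugEvent.SUSPEND) {
                if (event.getDetail() == DebugEvent.STEP_END) {
                    System.out.println("Stepping event at " + event.getSource());
                } else if (event.getDetail() == DebugEvent.BREAKPOINT) {
                    System.out.println("Breakpoint hit at " + event.getSource());
                }
            }
        }
    }
}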

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18); this association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger: the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (a non-invoked method) and the last node is an ending method (a non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs They are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated at the top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Fig 19 Breakpoint search tool (fuzzy search example)

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
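A search like the one in Figure 19 maps naturally onto ElasticSearch fuzzy and wildcard queries. The sketch below shows how such queries could be built with the ElasticSearch Java API; the field name typeName is an assumption about the index mapping:

import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;

public class BreakpointSearchSketch {
    public static void main(String[] args) {
        // "fcatory" still matches breakpoints in types named "factory",
        // because fuzzy matching tolerates transposed or missing letters
        QueryBuilder fuzzy = QueryBuilders.fuzzyQuery("typeName", "fcatory");
        // Wildcards cover partially remembered type names
        QueryBuilder wildcard = QueryBuilders.wildcardQuery("typeName", "*actor*");
        System.out.println(fuzzy.toString());    // prints the JSON query body
        System.out.println(wildcard.toString());
    }
}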


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph $G = (V, E)$, where $V$ is the set of vertices $V = \{V_1, V_2, \ldots, V_n\}$ and $E$ is the set of edges $E = \{(V_1, V_2), (V_1, V_3), \ldots\}$, each edge is an ordered pair $\langle V_i, V_j \rangle$ where $V_i$ is the invoking method and $V_j$ is the invoked method. If $\alpha$ is the subset of all vertices invoking methods and $\beta$ is the subset of all vertices invoked by methods, then the starting and ending methods are:

$StartingPoint = \{ V_{SP} \mid V_{SP} \in \alpha \wedge V_{SP} \notin \beta \}$

$EndingPoint = \{ V_{EP} \mid V_{EP} \in \beta \wedge V_{EP} \notin \alpha \}$

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
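Because the SDT records each invocation as an invoking/invoked pair, both sets can be computed with two set differences. A minimal Java sketch, with illustrative method names rather than actual SDS data:

import java.util.*;

public class StartingEndingPoints {
    // Each recorded invocation is an ordered pair <invoking, invoked>
    record Invocation(String invoking, String invoked) {}

    public static void main(String[] args) {
        List<Invocation> edges = List.of(
            new Invocation("Main.main", "Parser.parse"),
            new Invocation("Parser.parse", "Parser.readEntry"),
            new Invocation("Parser.readEntry", "Field.decode"));

        Set<String> alpha = new HashSet<>();  // vertices that invoke
        Set<String> beta = new HashSet<>();   // vertices that are invoked
        for (Invocation e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }

        Set<String> starting = new HashSet<>(alpha);
        starting.removeAll(beta);  // in alpha and not in beta
        Set<String> ending = new HashSet<>(beta);
        ending.removeAll(alpha);   // in beta and not in alpha

        System.out.println("Starting points: " + starting);  // [Main.main]
        System.out.println("Ending points: " + ending);      // [Field.decode]
    }
}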

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory for developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

Moreover we also suggest that debugging tasks could be divided intotwo activities one of locating bugs which could benefit from the col-lective intelligence of other developers and could be performed by dedicatedldquohuntersrdquo and another one of fixing the faults which requires deep un-derstanding of the program its design its architecture and the consequencesof changes This latter activity could be performed by dedicated ldquobuildersrdquoHence actionable results include recommender systems and a change of paradigmin the debugging of software programs

Last but not least the research community can leverage the SDI to con-duct more studies to improve our understanding of developersrsquo debugging be-

Swarm Debugging the Collective Intelligence on Interactive Debugging 39

haviour which could ultimately result into the development of whole newfamilies of debugging tools that are more efficient andndashor more adapted tothe particularity of debugging Many open questions remain and this paper isjust a first step towards fully understanding how collective intelligence couldimprove debugging activities

Our vision is that IDEs should incorporate a general framework to captureand exploit IDE interactions creating an ecosystem of context-aware appli-cations and plugins Swarm Debugging is the first step towards intelligentdebuggers and IDEs context-aware programs that monitor and reason abouthow developers interact with them providing for crowd software-engineering

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engi-neering Research Council of Canada (NSERC) the Brazilian research fundingagencies CNPq (National Council for Scientific and Technological Develop-ment) and CAPES Foundation (Finance Code 001) We also acknowledgeall the participants in our experiments and the insightful comments from theanonymous reviewers

References

1 AS Tanenbaum WH Benson Software Practice and Experience 3(2) 109 (1973)DOI 101002spe4380030204

2 H Katso in Unix Programmerrsquos Manual (Bell Telephone Laboratories Inc 1979) pNA

3 MA Linton in Proceedings of the Summer USENIX Conference (1990) pp 211ndash2204 R Stallman S Shebs Debugging with GDB - The GNU Source-Level Debugger (GNU

Press 2002)5 P Wainwright GNU DDD - Data Display Debugger (2010)6 A Ko Proceeding of the 28th international conference on Software engineering - ICSE

rsquo06 p 989 (2006) DOI 101145113428511344717 J Roszligler in 2012 1st International Workshop on User Evaluation for Software Engi-

neering Researchers USER 2012 - Proceedings (2012) pp 13ndash16 DOI 101109USER20126226573

8 TD LaToza Ba Myers 2010 ACMIEEE 32nd International Conference on SoftwareEngineering 1 185 (2010) DOI 10114518067991806829

9 AJ Ko HH Aung BA Myers in Proceedings 27th International Conference onSoftware Engineering 2005 ICSE 2005 (2005) pp 126ndash135 DOI 101109ICSE20051553555

10 TD LaToza G Venolia R DeLine in ICSE (2006) pp 492ndash50111 W Maalej R Tiarks T Roehm R Koschke ACM Transactions on Software Engi-

neering and Methodology 23(4) 1 (2014) DOI 101145262266912 MA Storey L Singer B Cleary F Figueira Filho A Zagalsky in Proceedings of the

on Future of Software Engineering - FOSE 2014 (ACM Press New York New YorkUSA 2014) pp 100ndash116 DOI 10114525938822593887

13 M Beller N Spruit A Zaidman How developers debug (2017) URL httpsdoi

org107287peerjpreprints2743v1

14 C Zhang J Yang D Yan S Yang Y Chen Journal of Software 86(3) 603 (2013)DOI 104304jsw83603-616

40Please give a shorter version with authorrunning and titlerunning prior to maketitle

15 R Tiarks T Rohm Softwaretechnik-Trends 32(2) 19 (2013) DOI 101007BF03323460 URL httplinkspringercom101007BF03323460

16 F Petrillo G Lacerda M Pimenta C Freitas in 2015 IEEE 3rd Working Con-ference on Software Visualization (VISSOFT) (IEEE 2015) pp 140ndash144 DOI101109VISSOFT20157332425

17 F Petrillo Z Soh F Khomh M Pimenta C Freitas YG Gueheneuc in In Proceed-ings of the 2016 IEEE International Conference on Software Quality Reliability andSecurity (QRS) (2016) p 10

18 F Petrillo H Mandian A Yamashita F Khomh YG Gueheneuc in 2017 IEEEInternational Conference on Software Quality Reliability and Security (QRS) (2017)pp 285ndash295 DOI 101109QRS201739

19 K Araki Z Furukawa J Cheng Software IEEE 8(3) 14 (1991) DOI 101109528893920 I Zayour A Hamdar Information and Software Technology 70 130 (2016) DOI

101016jinfsof20151001021 R Chern K De Volder in Proceedings of the 6th International Conference on Aspect-

oriented Software Development (ACM New York NY USA 2007) AOSD rsquo07 pp96ndash106 DOI 10114512185631218575 URL httpdoiacmorg1011451218563

1218575

22 Eclipse Managing conditional breakpoints URL httphelpeclipseorg

neonindexjsptopic=2Forgeclipsejdtdocuser2Ftasks2Ftask-manage_

conditional_breakpointhtmampcp=1_3_6_0_5

23 S Garnier J Gautrais G Theraulaz Swarm Intelligence 1(1) 3 (2007) DOI 101007s11721-007-0004-y

24 WR Tschinkel Journal of Bioeconomics 17(3) 271 (2015) DOI 101007s10818-015-9203-6 URL httpdxdoiorg101007s10818-015-9203-6http

linkspringercom101007s10818-015-9203-6

25 T Ball S Eick Computer 29(4) 33 (1996) DOI 101109248829926 A Cockburn Agile Software Development The Cooperative Game Second Edition

(Addison-Wesley Professional 2006)27 J Lawrance C Bogart M Burnett R Bellamy K Rector SD Fleming IEEE Trans-

actions on Software Engineering 39(2) 197 (2013) DOI 101109TSE201011128 D Piorkowski SD Fleming C Scaffidi M Burnett I Kwan AZ Henley J Macbeth

C Hill A Horvath in 2015 IEEE International Conference on Software Maintenanceand Evolution (ICSME) (2015) pp 11ndash20 DOI 101109ICSM20157332447

29 G Pothier E Tanter IEEE Software 26(6) 78 (2009) DOI 101109MS2009169URL httpieeexploreieeeorglpdocsepic03wrapperhtmarnumber=5287015

30 M Beller N Spruit D Spinellis A Zaidman in 40th International Conference on Soware Engineering ICSE (2018) pp 572ndash583

31 D Grove G DeFouw J Dean C Chambers Proceedings of the 12th ACM SIG-PLAN conference on Object-oriented programming systems languages and appli-cations - OOPSLA rsquo97 pp 108ndash124 (1997) DOI 101145263698264352 URLhttpportalacmorgcitationcfmdoid=263698264352

32 R Saito ME Smoot K Ono J Ruscheinski Pl Wang S Lotia AR Pico GDBader T Ideker Nature methods 9(11) 1069 (2012) DOI 101038nmeth2212URL httpwwwpubmedcentralnihgovarticlerenderfcgiartid=3649846amptool=

pmcentrezamprendertype=abstract

33 S Jiang C McMillan R Santelices Empirical Software Engineering pp 1ndash39 (2016)DOI 101007s10664-016-9441-9

34 R Pienta J Abello M Kahng DH Chau in 2015 International Conference on BigData and Smart Computing (BIGCOMP) (IEEE 2015) pp 271ndash278 DOI 10110935021BIGCOMP20157072812

35 J Sillito GC Murphy KD Volder IEEE Transactions on Software Engineering 34(4)434 (2008)

36 W Maalej R Tiarks T Roehm R Koschke ACM Transactions on Software En-gineering and Methodology 23(4) 311 (2014) DOI 1011452622669 URL http

doiacmorg1011452622669

37 AJ Ko BA Myers MJ Coblenz HH Aung IEEE Transaction on Software Engi-neering 32(12) 971 (2006)

Swarm Debugging the Collective Intelligence on Interactive Debugging 41

38 S Wang D Lo in Proceedings of the 22nd International Conference on Program Com-prehension - ICPC 2014 (ACM Press New York New York USA 2014) pp 53ndash63DOI 10114525970082597148

39 J Zhou H Zhang D Lo in 2012 34th International Conference on Software Engi-neering (ICSE) (IEEE 2012) pp 14ndash24 DOI 101109ICSE20126227210

40 X Ye R Bunescu C Liu in Proceedings of the 22nd ACM SIGSOFT InternationalSymposium on Foundations of Software Engineering - FSE 2014 (ACM Press NewYork New York USA 2014) pp 689ndash699 DOI 10114526358682635874

41 M Kersten GC Murphy in Proceedings of the 14th ACM SIGSOFT internationalsymposium on Foundations of software engineering (2006) pp 1ndash11

42 H Sanchez R Robbes VM Gonzalez in Software Analysis Evolution and Reengi-neering (SANER) 2015 IEEE 22nd International Conference on (2015) pp 251ndash260

43 A Ying M Robillard in Proceedings International Conference on Program Compre-hension (2011) pp 31ndash40

44 F Zhang F Khomh Y Zou AE Hassan in Proceedings Working Conference onReverse Engineering (2012) pp 456ndash465

45 Z Soh F Khomh YG Gueheneuc G Antoniol B Adams in Reverse Engineering(WCRE) 2013 20th Working Conference on (2013) pp 391ndash400 DOI 101109WCRE20136671314

46 TM Ahmed W Shang AE Hassan in Mining Software Repositories (MSR) 2015IEEEACM 12th Working Conference on (2015) pp 99ndash110 DOI 101109MSR201517

47 P Romero B du Boulay R Cox R Lutz S Bryant International Journal of Human-Computer Studies 65(12) 992 (2007) DOI 101016jijhcs200707005

48 I Katz J Anderson Human-Computer Interaction 3(4) 351 (1987) DOI 101207s15327051hci0304 2

49 B Ashok J Joy H Liang SK Rajamani G Srinivasa V Vangala in Proceedings ofthe 7th joint meeting of the European software engineering conference and the ACMSIGSOFT symposium on The foundations of software engineering on European soft-ware engineering conference and foundations of software engineering symposium - E(ACM Press New York New York USA 2009) p 373 DOI 10114515956961595766

50 C Parnin A Orso Proceedings of the 2011 International Symposium on SoftwareTesting and Analysis ISSTA 11 p 199 (2011) DOI 10114520014202001445

51 AX Zheng MI Jordan B Liblit M Naik A Aiken Challenges 148 1105 (2006)DOI 10114511438441143983

52 A Ko BA Myers in CHI 2009 - Proceedings of the SIGCHI Conference on HumanFactors in Computing Systems ed by ACM (New York New York USA 2009) pp1569ndash1578 DOI 10114515187011518942

53 B Hofer F Wotawa Advances in Software Engineering 2012 13 (2012) DOI 1011552012628571

54 J Ressia A Bergel O Nierstrasz in Proceedings - International Conference on Soft-ware Engineering (2012) pp 485ndash495 DOI 101109ICSE20126227167

55 HC Estler M Nordio Ca Furia B Meyer Proceedings - IEEE 8th InternationalConference on Global Software Engineering ICGSE 2013 pp 110ndash119 (2013) DOI101109ICGSE201321

56 S Demeyer S Ducasse M Lanza in Proceedings of the Sixth Working Conference onReverse Engineering (IEEE Computer Society Washington DC USA 1999) WCRErsquo99 pp 175ndash URL httpdlacmorgcitationcfmid=832306837044

57 D Piorkowski S Fleming in Proceedings of the SIGCHI Conference on Human Factorsin Computing Systems CHI rsquo13 (ACM Paris France 2013) pp 3063mdash-3072 DOI10114524664162466418 URL httpdlacmorgcitationcfmid=2466418

58 SD Fleming C Scaffidi D Piorkowski M Burnett R Bellamy J Lawrance I KwanACM Transactions on Software Engineering and Methodology 22(2) 1 (2013) DOI10114524305452430551

42Please give a shorter version with authorrunning and titlerunning prior to maketitle

Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed bythe Swarm Debugging Tracer (SDT) to store and later share debugging datafrom and between developers Figure 13 shows the architecture of this in-frastructure The SDT sends RESTful messages that are received by a SDSinstance that stores them in three specialized persistence mechanisms an SQLdatabase (PostgreSQL) a full-text search engine (ElasticSearch) and a graphdatabase (Neo4J)

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define thesemantics of the SDT messages

We choose and define domain concepts to model software projects anddebugging data Figure 14 shows the meta-model of these concepts using anentity-relationship representation The concepts are inspired by the FAMIXData model [56] The concepts include

ndash Developer is a SDT user She creates and executes debugging sessionsndash Product is the target software product A product is a set of Eclipse

projects (1 or more)ndash Task is the task to be executed by developersndash Session represents a Swarm Debugging session It relates developer project

and debugging events

Swarm Debugging the Collective Intelligence on Interactive Debugging 43

Fig 14 The Swarm Debugging metadata [17]

ndash Type represents classes and interfaces in the project Each type has asource code and a file SDS only considers types that have source codeavailable as belonging to the project domain

ndash Method is a method associated with a type which can be invoked duringdebugging sessions

ndash Namespace is a container for types In Java namespaces are declaredwith the keyword package

ndash Invocation is a method invoked from another method (or from the JVMin case of the main method)

ndash Breakpoint represents the data collected when a developer toggles abreakpoint in the Eclipse IDE Each breakpoint is associated with a typeand a method if appropriate

ndash Event is an event data that is collected when a developer performs someactions during a debugging session

The SDS provides several services for manipulating querying and search-ing collected data (1) Swarm RESTful API (2) SQL query console (3) full-text search API (4) dashboard service and (5) graph querying console

Swarm RESTful API The SDS provides a RESTful API to manipulate de-bugging data using the Spring Boot framework29 Create retrieve update

29 httpprojectsspringiospring-boot

44Please give a shorter version with authorrunning and titlerunning prior to maketitle

and delete operations are available through HTTP requests and respond witha JSON structure For example upon submitting the HTTP request

httpswarmdebuggingorgdevelopers

searchfindByNamename=petrillo

the SDS responds with a list of developers whose names are ldquopetrillordquo inJSON format

SQL Query Console The SDS provides a console30 to receive SQL queries(SQL) on the debugging data providing relational aggregations and functions

Full-text Search Engine The SDS also provides an ElasticSearch31 which isa highly scalable open-source full-text search and analytic engine to storesearch and analyse the debugging data The SDS instantiates an instance ofthe ElasticSearch engine and offers a console for executing complex queries onthe debugging data

Dashboard Service The ElasticSearch allows the use of the Kibana dashboardThe SDS exposes a Kibana instance on the debugging data With the dash-board researchers can build charts describing the data Figure 15 shows aSwarm Dashboard embedded into Eclipse as a view

Fig 15 Swarm Debugging Dashboard

30 httpdbswarmdebuggingorg31 httpswwwelasticco

Swarm Debugging the Collective Intelligence on Interactive Debugging 45

Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32

graph database Neo4J provides a query language named Cypher which is adeclarative SQL-inspired language for describing patterns in graphs It allowsresearchers to express what they want to select insert update or delete froma graph database without describing precisely how to do it The SDS exposesthe Neo4J Browser and creates an Eclipse view

Figure 16 shows an example of Cypher query and the resulting graph

Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debug-ger events during debugging sessions extending the Java Platform DebuggingArchitecture (JDPA) Using the Eclipse JPDA events are listened by our De-bugTracer that implements two listenersIDebugEventSetListener and IBreakpointListener Figure 17 shows theSDT architecture

After an authentication process developers create a debugging session us-ing the Swarm Manager view and toggle breakpoints trigger stepping eventsas Step Into Step Over or Step Return These events are caught and stacktrace items are analyzed by the Tracer extracting method invocations

To use the SDT a developer must open the view ldquoSwarm Managerrdquo and es-tablish a connection with the Swarm Debugging Services If the target projectis not into the Swarm Manager she can associate any project in her work-space into Swarm Manager (as shown in Figure 18) This association consistsof linking a Swarm Session with a project in the Eclipse workspace Secondshe must create a Swarm session Once a session is established she can useany feature of the regular Eclipse debugger the SDT collects developersrsquo in-teraction events in the background with no visible performance decrease

32 http://neo4j.com


Fig. 17 The Swarm Tracer architecture [17]

Fig. 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and its related data. It also stores data about the methods called, storing an invocation entry for each invoking/invoked method pair. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig. 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, which connects to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] that represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace, without returns. The first node is a starting method (a non-invoked method) and the last node is an ending method (a non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse Editor by double-clicking a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on this graph. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


Fig. 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate method invocation sequences by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool supports fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word "fcatory".
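Because the breakpoint data lives in a plain ElasticSearch index, such a fuzzy query can also be issued directly against the search engine. In the sketch below, the breakpoints index and typeName field are hypothetical names used only for illustration.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        // A standard ElasticSearch fuzzy query: "fcatory" should still
        // match breakpoints toggled in types whose names contain "factory".
        String query = "{ \"query\": { \"fuzzy\": "
                + "{ \"typeName\": { \"value\": \"fcatory\" } } } }";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}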


Fig. 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair ⟨Vi, Vj⟩ where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes invoked by methods, then the starting and ending methods are:

StartingPoint = {V_SP | V_SP ∈ α ∧ V_SP ∉ β}

EndingPoint = {V_EP | V_EP ∈ β ∧ V_EP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
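These definitions translate directly into set operations over the collected invocation pairs. A small self-contained Java sketch (with hypothetical method names) follows:

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GraphEndpoints {

    record Invocation(String invoking, String invoked) { }

    public static void main(String[] args) {
        // Hypothetical invocation pairs collected during a session.
        List<Invocation> edges = List.of(
                new Invocation("main", "Circle.draw"),
                new Invocation("Circle.draw", "Canvas.paint"));

        Set<String> alpha = new HashSet<>(); // invoking methods
        Set<String> beta = new HashSet<>();  // invoked methods
        for (Invocation e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }

        Set<String> starting = new HashSet<>(alpha);
        starting.removeAll(beta); // alpha minus beta: invoke, but never invoked
        Set<String> ending = new HashSet<>(beta);
        ending.removeAll(alpha);  // beta minus alpha: invoked, but never invoke

        System.out.println(starting); // [main]
        System.out.println(ending);   // [Canvas.paint]
    }
}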

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, because software exploration is divided by sessions, the resulting call graphs are easy to understand: only intentionally visited areas are shown on these graphs, so one can go through the execution of a project and see only the areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, so new tracers can be implemented for different IDEs or debuggers.


6.4.4 Accidental Limitations

Presentation. Several participants commented on the presentation of the information by our visualisation. Most importantly, they remarked that identifying the location of the fault was difficult because there was no distinction between faulty and non-faulty classes. In the future, we will assess the use of icons and/or colours to identify faulty classes/methods.

Others commented on the lack of captions describing the various visual elements. Although this information was present in the tutorial and questionnaires, we will also add it to the visualisation, possibly using tooltips.

One participant added that more information, such as "execution time metrics [by] invocations" and "failure/success rate [by] invocations", could be valuable. We plan to perform other controlled experiments with such additional information to assess its impact on developers' performance.

Finally, one participant mentioned that arrows would sometimes overlap, which points to the need for a better layout algorithm for the graph in our visualisation. However, finding a good graph layout is a well-known difficult problem.

Navigation. One participant commented that the visualisation does not help developers navigate between classes whose methods have low cohesion. It should be possible to show methods and their classes independently in different parts of the graph, to avoid large nodes. We plan to modify the graph visualisation to offer a "method-level" view whose nodes could be methods and/or clusters of methods (independently of their classes).

6.4.5 General Feedback

Three participants left general feedback regarding their experience with our visualisation under the question "Describe your debugging experience". All three participants provided positive comments. We report herein one of the three comments:

It went pretty well. In the beginning I was at a loss, so just was looking around for some time. Then I opened the breakpoints view for another task that was related to file parsing, in the hope to find some hints. And indeed, I've found the BibtexParser class, where the method with the most number of breakpoints was the one where I later found the fault. However, only this knowledge was not enough, so I had to study the code a bit. Luckily, it didn't require too much effort to spot the problem because all the related code was concentrated inside the parser class. Luckily, I had a BibTeX database at hand to use it for debugging. It was excellent.

This comment highlights the advantages of our approach and suggests that our premise may be correct and that developers may benefit from one another's debugging sessions. It encourages us to pursue our research work in this direction and perform more experiments to identify further ways of improving our approach.

7 Discussion

We now discuss some implications of our work for software engineering researchers, developers, debugger developers, and educators. The SDI (and GV) is open and freely available on-line28, and researchers can use it to perform new empirical studies about debugging activities.

Developers can use the SDI to record their debugging patterns and to identify the debugging strategies that are most efficient in the context of their projects, improving their debugging skills.

Developers can share their debugging activities, such as breakpoints and/or stepping paths, to improve collaborative work and ease debugging. While developers usually work on specific tasks, there are sometimes re-opened issues and/or similar tasks that require understanding or toggling breakpoints on the same entity. Thus, breakpoints previously toggled by one developer could assist another developer working on a similar task. For instance, the breakpoint search tool can be used to retrieve breakpoints from previous debugging sessions, which could help speed up a new session by providing developers with valid starting points. Therefore, the breakpoint search tool can decrease the time spent toggling new breakpoints.

Developers of debuggers can use the SDI to understand developers' debugging habits and to create new tools, using novel data-mining techniques, that integrate different data sources. The SDI provides a transparent framework for developers to share debugging information, creating a collective intelligence about their projects.

Educators can leverage the SDI to teach interactive debugging techniques, tracing their students' debugging sessions and evaluating their performance. Data collected by the SDI from debugging sessions performed by professional developers could also be used to educate students, e.g., by showing them examples of good and bad debugging patterns.

There are locations (lines of code, classes, or methods) on which many breakpoints were set, in different tasks and by different developers, and this is an opportunity to recommend those locations as candidates for new debugging sessions. However, we could face a bootstrapping problem: we cannot know that these locations are important until developers start to put breakpoints on them. This problem could be addressed over time by using the infrastructure to collect and share breakpoints, accumulating data that can be used for future debugging sessions. Further, such incremental usefulness can encourage more developers to collect and share breakpoints, possibly leading to better automated recommendations.

28 http://github.com/swarmdebugging


We have answered the question of what debugging information is useful to share among developers to ease debugging, with evidence that sharing debugging breakpoints and sessions can ease developers' debugging activities. Our study provides useful insights to researchers and tool developers on how to provide appropriate support during debugging activities: in general, they could support developers by sharing other developers' breakpoints and sessions. They could also develop recommender systems to help developers decide where to set breakpoints, and use this evidence to build a grounded theory of how developers set breakpoints and step through code, to improve debuggers and other tool support.

8 Threats to Validity

Despite its promising results, there exist threats to the validity of our study, which we discuss in this section.

As with any other empirical study, ours is subject to limitations that threaten the validity of its results. The first limitation is related to the number of participants: with 7 participants, we cannot claim generalization of the results. However, we accept this limitation because the goal of the study was to show the effectiveness of the data collected by the SDI in obtaining insights about developers' debugging activities. Future studies with a more significant number of participants, and with more systems and tasks, are needed to confirm the results of the present research.

Other threats to the validity of our study concern its internal, external, and conclusion validity. We accept these threats because the experimental study aimed to show the effectiveness of the SDI in collecting and sharing data about developers' interactive debugging activities. Future work is needed to perform in-depth experimental studies with these research questions and others, possibly drawn from the ones that developers asked in another study, by Sillito et al. [35].

Construct validity threats are related to the metrics used to answer our research questions. We mainly used breakpoint locations, which is a precise measure. Moreover, as we located breakpoints using our Swarm Debugging Infrastructure (SDI) and visualisation, any issue with this measure would affect our results. To mitigate these threats, we collected both SDI data and video captures of the participants' screens, and compared the information extracted from the videos with the data collected by the SDI. We observed that the breakpoints collected by the SDI are exactly those toggled by the participants.

We asked participants to self-report on their efforts during the tasks, their levels of experience, etc., through questionnaires. Consequently, it is possible that the answers do not represent their real efforts, levels, etc. We accept this threat because questionnaires are the best means to collect data about participants without incurring a high cost. Construct validity could be improved in future work by using instruments to measure effort independently, for example, but this would lead to more time- and effort-consuming experiments.


Conclusion validity threats concern the relations found between independent and dependent variables. In particular, they concern the assumptions of the statistical tests performed on the data and how diverse the data is. We did not perform any statistical analysis to answer our research questions, so our results do not depend on any statistical assumption.

Internal validity threats are related to the tools used to collect the data, the subject systems, and whether the collected data is sufficient to answer the research questions. We collected data using our visualisation. We are well aware that our visualisation does not scale to large systems, but for JabRef it allowed participants to share paths during debugging and researchers to collect relevant data, including shared paths. We plan to revise our visualisation in the near future to identify possibilities to improve it so that it scales up to large systems.

Each participant performed more than one task on the same system. It is possible that a participant may have become familiar with the system after executing a task and would be knowledgeable enough to toggle breakpoints when performing the subsequent ones. However, we did not observe any significant difference in performance when comparing the results of the same participant on the first and last tasks. Therefore, we accept this threat but still plan future studies with more tasks on more systems. The participants were probably aware of the fact that all faults had already been fixed in GitHub. We controlled this issue using the video recordings, observing that no participant looked at the commit history during the experiment.

External validity threats are about the possibility of generalising our results. We used only one system in our Study 1 (JabRef) because we needed enough data points from a single system to assess the effectiveness of breakpoint prediction. We should collect more data on other systems and check whether the system used can affect our results.

9 Related work

We now summarise work related to debugging to better position our study among the published research.

Program Understanding. Previous work studied program comprehension and provided tools to support it. Maalej et al. [36] observed and surveyed developers during program comprehension activities. They concluded that developers need runtime information and reported that developers frequently execute programs using a debugger. Ko et al. [37] observed that developers spend large amounts of time navigating between program elements.

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], version history and defect report similarity [38], while others, like Mylyn [41], use developers' interaction traces, which have been used to study work interruptions [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it just filters files in an Eclipse view following a previous context. Our approach works both in editing mode (finding breakpoints or visualizing paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they should be used together during development sessions.

Debugging Tools for Program Understanding. Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system that improves debugging productivity by automating the search for similar issues from the past.

Zayour [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated debugging tools. Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but to date there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that empirical evidence of the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time needed to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the actual time saved by using automated debugging tools.

Advanced Debugging Approaches. Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], by Hofer et al., builds on the observation that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia [54] proposed object-centric debugging, focusing on objects as the key abstraction of the execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging. Jiang et al. [33] studied the change impact analysis process that developers should perform during software maintenance to make sure changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies did static change impact analysis before making changes, using IDE navigational functionalities, and did dynamic change impact analysis after making changes, by running the programs. In their studies, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human debugging effort.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviors have not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the fact that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information: breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping into/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos, covering 45 debugging sessions performed by 28 different, independent developers and containing 307 breakpoints on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale: different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code where there is a higher incidence of debugging events) and/or more error-prone classes and methods. Finally, and surprisingly, different independent developers set breakpoints at the same locations for similar debugging tasks, and thus collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation produced elements that support developers' hypotheses about fault locations without previously looking at the code, and (2) sharing previous debugging sessions supports debugging hypotheses, consequently reducing the effort spent searching the code.

Our results provide evidence that previous debugging sessions provide insights to, and can be starting points for, developers when building debugging hypotheses. They showed that developers construct correct hypotheses on fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened, sometimes months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental-model processes involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers. Developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis for building a grounded theory of developers' use of breakpoints to improve debuggers and other tool support.

Moreover, we also suggest that debugging tasks could be divided into two activities: one of locating bugs, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and/or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is the first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also acknowledge all the participants in our experiments and the insightful comments from the anonymous reviewers.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB - The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD - Data Display Debugger (2010)
6. A. Ko, Proceedings of the 28th international conference on Software engineering - ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 - Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the Future of Software Engineering - FOSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development (ACM, New York, NY, USA, 2007), AOSD '07, pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension - ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering - FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ed. by ACM (New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings - International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, Proceedings - IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering (IEEE Computer Society, Washington, DC, USA, 1999), WCRE '99, pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 33: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

Swarm Debugging the Collective Intelligence on Interactive Debugging 33

debugging sessions It encourages us to pursue our research work in this di-rection and perform more experiments to point further ways of improving ourapproach

7 Discussion

We now discuss some implications of our work for Software Engineering re-searchers developers debuggersrsquo developers and educators SDI (and GV) isopen and freely available on-line28 and researchers can use them to performnew empirical studies about debugging activities

Developers can use SDI to record their debugging patterns toidentify debugging strategies that are more efficient in the context of theirprojects to improve their debugging skills

Developers can share their debugging activities such as breakpointsandndashor stepping paths to improve collaborative work and ease debuggingWhile developers usually work on specific tasks there are sometimes re-openissues andndashor similar tasks that need to understand or toggle breakpoints onthe same entity Thus using breakpoints previously toggled by a developercould help to assist another developer working on a similar task For instancethe breakpoint search tools can be used to retrieve breakpoints from previousdebugging sessions which could help speed up a new one providing devel-opers with valid starting points Therefore the breakpoint searching tool candecrease the time spent to toggle a new breakpoint

Developers of debuggers can use SDI to understand developersrsquodebugging habits to create new tools ndash using novel data-mining techniques ndashto integrate different data sources SDI provides a transparent framework fordevelopers to share debugging information creating a collective intelligenceabout their projects

Educators can leverage SDI to teach interactive debugging tech-niques tracing their studentsrsquo debugging sessions and evaluating their per-formance Data collected by SDI from debugging sessions performed by pro-fessional developers could also be used to educate students eg by showingthem examples of good and bad debugging patterns

There are locations (line of code class or method) on which there wereset many breakpoints in different tasks by different developers and this is anopportunity to recommend those locations as candidates for new debuggingsessions However we could face the bootstrapping problem we cannot knowthat these locations are important until developers start to put breakpointson them This problem could be addressed with time by using the infrastruc-ture to collect and share breakpoints accumulating data that can be used forfuture debugging sessions Further such incremental usefulness can encouragemore developers to collect and share breakpoints possibly leading to better-automated recommendations

28 httpgithubcomswarmdebugging

34Please give a shorter version with authorrunning and titlerunning prior to maketitle

We have answered what debugging information is useful to share among de-velopers to ease debugging with evidence that sharing debugging breakpointsand sessions can ease developersrsquo debugging activities Our study provides use-ful insights to researchers and tool developers on how to provide appropriatesupport during debugging activities in general they could support develop-ers by sharing other developersrsquo breakpoints and sessions They could alsodevelop recommender systems to help developers in deciding where to setbreakpointsand use this evidence to build a grounded theory on the setting ofbreakpoints and stepping by developers to improve debuggers and other toolsupport

8 Threats to Validity

Despite its promising results there exist threats to the validity of our studythat we discuss in this section

As any other empirical study ours is subject to limitations that threatenthe validity of its results The first limitation is related to the number ofparticipants we had With 7 participants we can not claim generalization ofthe results However we accept this limitation because the goal of the studywas to show the effectiveness of the data collected by the SDI to obtain insightsabout developersrsquo debugging activities Future studies with a more significantnumber of participants and more systems and tasks are needed to confirm theresults of the present research

Other threats to the validity of our study concern their internal externaland conclusion validity We accept these threats because the experimentalstudy aimed to show the effectiveness of the SDI to collect and share dataabout developersrsquo interactive debugging activities Future work is needed toperform in-depth experimental studies with these research questions and oth-ers possibly drawn from the ones that developers asked in another study bySillito et al [35]

Construct Validity Threats are related to the metrics used to answerour research questions We mainly used breakpoint locations which is a precisemeasure Moreover as we located breakpoints using our Swarm DebuggingInfrastructure (SDI) and visualisation any issue with this measure would affectour results To mitigate these threats we collected both SDI data and videocaptures of the participantsrsquo screens and compared the information extractedfrom the videos with the data collected by the SDI We observed that thebreakpoints collected by the SDI are exactly those toggled by the participants

We ask participants to self-report on their efforts during the tasks levels ofexperience etc through questionnaires Consequently it is possible that theanswer does not represent their real efforts levels etc We accept this threatbecause questionnaires are the best means to collect data about participantswithout incurring a high cost Construct validity could be improved in futurework by using instruments to measure effort independently for example butthis would lead to more time- and effort-consuming experiments

Swarm Debugging the Collective Intelligence on Interactive Debugging 35

Conclusion Validity Threats concern the relations found between inde-pendent and dependent variables In particular they concern the assumptionsof the statistical tests performed on the data and how diverse is the data Wedid not perform any statistical analysis to answer our research questions soour results do not depend on any statistical assumption

Internal Validity Threats are related to the tools used to collect thedata and the subject systems and if the collected data is sufficient to answerthe research questions We collected data using our visualisation We are wellaware that our visualisation does not scale for large systems but for JabRef itallowed participants to share paths during debugging and researchers to collectrelevant data including shared paths We plan to revise our visualisation inthe near future to identify possibilities to improve it so that it scales up tolarge systems

Each participant performed more than one task on the same system It ispossible that a participant may have become familiar with the system after ex-ecuting a task and would be knowledgeable enough to toggle breakpoints whenperforming the subsequent ones However we did not observe any significantdifference in performance when comparing the results for the same participantfor the first and last task Therefore we accept this threat but still plan for fu-ture studies with more tasks on more systems The participants probably wereaware of the fact that all faults were already solved in Github We controlledthis issue using the video recordings observing that all participants did notlook at the commit history during the experiment

External Validity Threats are about the possibility to generalise ourresults We use only one system in our Study 1 (JabRef) because we neededto have enough data points from a single system to assess the effectivenessof breakpoint prediction We should collect more data on other systems andcheck whether the system used can affect our results

9 Related work

We now summarise works related to debugging to allow better positioning ofour study among the published research

Program Understanding Previous work studied program comprehension andprovided tools to support program comprehension Maalej et al [36] observedand surveyed developers during program comprehension activities They con-cluded that developers need runtime information and reported that developersfrequently execute programs using a debugger Ko et al [37] observed that de-velopers spend large amounts of times navigating between program elements

Feature and fault location approaches are used to identify and recommend program elements that are relevant to a task at hand [38]. These approaches use defect reports [39], domain knowledge [40], or version history and defect-report similarity [38], while others, like Mylyn [41], use developers' interaction traces,


which have been used to study work interruptions [42], editing patterns [43,44], program exploration patterns [45], or copy/paste behaviour [46].

Despite sharing similarities (tracing developer events in an IDE), our approach differs from Mylyn's [41]. First, Mylyn's approach does not collect or use any dynamic debugging information: it is not designed to explore the dynamic behaviours of developers during debugging sessions. Second, it is useful in editing mode because it filters files in an Eclipse view following a previous context. Our approach is useful both in editing mode (finding breakpoints or visualising paths) and during interactive debugging sessions. Consequently, our work and Mylyn's are complementary, and they could be used together during development sessions.

Debugging Tools for Program Understanding Romero et al. [47] extended the work by Katz and Anderson [48] and identified high-level debugging strategies, e.g., stepping and breaking execution paths and inspecting variable values. They reported that developers use the information available in debuggers differently, depending on their background and level of expertise.

DebugAdvisor [49] is a recommender system to improve debugging produc-tivity by automating the search for similar issues from the past

Zayour and Hamdar [20] studied the difficulties faced by developers when debugging in IDEs and reported that the features of the IDE affect the time spent by developers on debugging activities.

Automated Debugging Tools Automated debugging tools require both successful and failed runs and do not support programs with interactive inputs [6]. Consequently, they have not been widely adopted in practice. Moreover, automated debugging approaches are often unable to indicate the "true" locations of faults [7]. Other, more interactive methods, such as slicing and query languages, help developers, but, to date, there has been no evidence that they significantly ease developers' debugging activities.

Recent studies showed that the empirical evidence for the usefulness of many automated debugging techniques is limited [50]. Researchers also found that automated debugging tools are rarely used in practice [50]. At least in some scenarios, the time needed to collect coverage information, manually label the test cases as failing or passing, and run the calculations may exceed the time actually saved by using automated debugging tools.

Advanced Debugging Approaches Zheng et al. [51] presented a systematic approach to the statistical debugging of programs in the presence of multiple faults, using probability inference and a common voting framework to accommodate more general faults and predicate settings. Ko and Myers [6,52] introduced interrogative debugging, a process with which developers ask questions about their programs' outputs to determine what parts of the programs to understand.


Pothier and Tanter [29] proposed omniscient debuggers, an approach to support back-in-time navigation across previous program states. Delta debugging [53], discussed by Hofer et al., exploits the fact that the smaller the failure-inducing input, the less program code is covered; it can be used to minimise a failure-inducing input systematically. Ressia et al. [54] proposed object-centric debugging, focusing on objects as the key abstraction of execution for many tasks.

Estler et al. [55] discussed collaborative debugging, suggesting that collaboration in debugging activities is perceived as important by developers and can improve their experience. Our approach is consistent with this finding, although we use asynchronous debugging sessions.

Empirical Studies on Debugging Jiang et al. [33] studied the change impact analysis that developers should perform during software maintenance to make sure that changes do not introduce new faults. They conducted two studies about change impact analysis during debugging sessions. They found that the programmers in their studies performed static change impact analysis before making changes, using IDE navigational functionalities, and dynamic change impact analysis after making changes, by running the programs. In their studies, programmers did not use any change impact analysis tools.

Zhang et al. [14] proposed a method to generate breakpoints based on existing fault localization techniques, showing that the generated breakpoints can usually save some human debugging effort.

10 Conclusion

Debugging is an important and challenging task in software maintenance, requiring dedication and expertise. However, despite its importance, developers' debugging behaviour has not been extensively and comprehensively studied. In this paper, we introduced the concept of Swarm Debugging, based on the observation that developers performing different debugging sessions build collective knowledge. We asked what debugging information is useful to share among developers to ease debugging. We particularly studied two pieces of debugging information, breakpoints (and their locations) and sessions (debugging paths), because these pieces of information are related to the two main activities during debugging: setting breakpoints and stepping in/over/out of statements.

To evaluate the usefulness of Swarm Debugging and the sharing of debugging data, we conducted two observational studies. In the first study, to understand how developers set breakpoints, we collected and analyzed more than 10 hours of developers' videos from 45 debugging sessions performed by 28 different, independent developers, containing 307 breakpoints, on three software systems.

The first study allowed us to draw four main conclusions. First, setting the first breakpoint is not an easy task, and developers need tools to locate the places where to toggle breakpoints. Second, the time of setting the first breakpoint is a predictor of the duration of a debugging task, independently of the task. Third, developers choose breakpoints purposefully, with an underlying rationale: different developers set breakpoints on the same lines of code for the same task, and different developers also toggle breakpoints on the same classes or methods for different tasks, showing the existence of important "debugging hot-spots" (i.e., regions in the code with a higher incidence of debugging events) and–or more error-prone classes and methods. Finally, and surprisingly, different, independent developers set breakpoints at the same locations for similar debugging tasks; thus, collecting and sharing breakpoints could assist developers during debugging tasks.

Further, we conducted a qualitative study with 23 professional developers and a controlled experiment with 13 professional developers, collecting more than 3 hours of developers' debugging sessions. From this second study, we concluded that (1) combining stepping paths from several debugging sessions in a graph visualisation provided elements that supported developers' hypotheses about fault locations without their having to look at the code beforehand, and (2) sharing previous debugging sessions supports debugging hypotheses and, consequently, reduces the effort spent searching the code.

Our results provide evidence that previous debugging sessions give insights to, and can be starting points for, developers building debugging hypotheses. They showed that developers construct correct hypotheses about fault locations when looking at graphs built from previous debugging sessions. Moreover, they showed that developers can use past debugging sessions to identify starting points for new debugging sessions. Furthermore, faults are recurrent and may be reopened months later. Sharing debugging sessions (as Mylyn does for editing sessions) is an approach to support debugging hypotheses and the reconstruction of the complex mental models involved in debugging. However, research work is in progress to corroborate these results.

In future work, we plan to build grounded theories on the use of breakpoints by developers. We will use these theories to recommend breakpoints to other developers, as developers need tools to locate adequate places to set breakpoints in their source code. Our results suggest the opportunity for a breakpoint recommendation system, similar to previous work [14]. They could also form the basis of a grounded theory of developers' use of breakpoints, to improve debuggers and other tool support.

Moreover, we suggest that debugging tasks could be divided into two activities: one of locating faults, which could benefit from the collective intelligence of other developers and could be performed by dedicated "hunters", and another of fixing the faults, which requires a deep understanding of the program, its design, its architecture, and the consequences of changes. This latter activity could be performed by dedicated "builders". Hence, actionable results include recommender systems and a change of paradigm in the debugging of software programs.

Last but not least, the research community can leverage the SDI to conduct more studies to improve our understanding of developers' debugging behaviour, which could ultimately result in the development of whole new families of debugging tools that are more efficient and–or better adapted to the particularities of debugging. Many open questions remain, and this paper is just a first step towards fully understanding how collective intelligence could improve debugging activities.

Our vision is that IDEs should incorporate a general framework to capture and exploit IDE interactions, creating an ecosystem of context-aware applications and plug-ins. Swarm Debugging is a first step towards intelligent debuggers and IDEs: context-aware programs that monitor and reason about how developers interact with them, providing for crowd software engineering.

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Brazilian research funding agencies CNPq (National Council for Scientific and Technological Development) and CAPES Foundation (Finance Code 001). We also thank all the participants in our experiments and the anonymous reviewers for their insightful comments.

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979)
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB – The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD – Data Display Debugger (2010)
6. A. Ko, in Proceedings of the 28th International Conference on Software Engineering – ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 – Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings, 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the on Future of Software Engineering – FOSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Röhm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, Software, IEEE 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, in Proceedings of the 12th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications – OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31:1 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension – ICPC 2014 (ACM Press, New York, NY, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering – FSE 2014 (ACM Press, New York, NY, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ESEC/FSE '09) (ACM Press, New York, NY, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, in Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 – Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, NY, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings – International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, in Proceedings – IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define thesemantics of the SDT messages

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56] and include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of one or more Eclipse projects.
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates a developer, a project, and debugging events.


Fig 14 The Swarm Debugging metadata [17]

– Type represents the classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types whose source code is available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
– Event is the data collected when a developer performs some action during a debugging session.

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API The SDS provides a RESTful API, built with the Spring Boot framework29, to manipulate debugging data. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
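As an illustration of how a client might consume this API, the following sketch issues the request above with Java 11's built-in HTTP client and prints the JSON response. The endpoint comes from the text; the surrounding class and the response handling are ours:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class SwarmApiClientExample {
        public static void main(String[] args) throws Exception {
            // Build the findByName query shown above.
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                    .header("Accept", "application/json")
                    .GET()
                    .build();

            // Send it and print the raw JSON list of matching developers.
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode()); // e.g., 200
            System.out.println(response.body());       // JSON payload
        }
    }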

SQL Query Console The SDS provides a console30 that accepts SQL queries on the debugging data, providing relational aggregations and functions.
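For instance, assuming tables and columns named after the meta-model concepts of Figure 14 (a breakpoint table with a type_name column; the actual schema is not documented here), a sketch of a query ranking the types on which breakpoints are most often toggled could look as follows, here issued over JDBC against the PostgreSQL store:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class BreakpointHotspotsQuery {
        public static void main(String[] args) throws SQLException {
            // Connection URL, credentials, and schema are assumptions; the paper
            // only states that the SDS stores data in PostgreSQL.
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/swarm", "user", "secret");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(
                     "SELECT type_name, COUNT(*) AS breakpoints "
                     + "FROM breakpoint GROUP BY type_name "
                     + "ORDER BY breakpoints DESC LIMIT 10")) {
                while (rs.next()) {
                    System.out.println(rs.getString("type_name") + ": " + rs.getInt("breakpoints"));
                }
            }
        }
    }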

Full-text Search Engine The SDS also provides ElasticSearch31, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service ElasticSearch allows the use of the Kibana dashboard, and the SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
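As a sketch of the kind of query researchers could also run programmatically (the bolt URL, credentials, and the Method/INVOKES labels are our assumptions, since the exact graph schema is not documented here), the official Neo4J Java driver can execute Cypher as follows:

    import org.neo4j.driver.AuthTokens;
    import org.neo4j.driver.Driver;
    import org.neo4j.driver.GraphDatabase;
    import org.neo4j.driver.Result;
    import org.neo4j.driver.Session;

    public class SwarmGraphQueryExample {
        public static void main(String[] args) {
            try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                    AuthTokens.basic("neo4j", "secret"));
                 Session session = driver.session()) {
                // List caller/callee pairs recorded during debugging sessions.
                Result result = session.run(
                        "MATCH (m:Method)-[:INVOKES]->(n:Method) "
                        + "RETURN m.name AS caller, n.name AS callee LIMIT 25");
                result.forEachRemaining(record -> System.out.println(
                        record.get("caller").asString() + " -> " + record.get("callee").asString()));
            }
        }
    }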

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners, IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and the stack trace items are analyzed by the Tracer, which extracts method invocations.
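A minimal sketch of such a tracer, using the standard Eclipse Debug Core listener interfaces named above (the SDS-forwarding calls are elided; this is not the actual SDT source), could look like this:

    import org.eclipse.core.resources.IMarkerDelta;
    import org.eclipse.debug.core.DebugEvent;
    import org.eclipse.debug.core.DebugPlugin;
    import org.eclipse.debug.core.IBreakpointListener;
    import org.eclipse.debug.core.IDebugEventSetListener;
    import org.eclipse.debug.core.model.IBreakpoint;

    public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

        public void start() {
            // Register for debug events and breakpoint changes.
            DebugPlugin plugin = DebugPlugin.getDefault();
            plugin.addDebugEventListener(this);
            plugin.getBreakpointManager().addBreakpointListener(this);
        }

        @Override
        public void handleDebugEvents(DebugEvent[] events) {
            for (DebugEvent event : events) {
                // A SUSPEND event signals that a step completed or a breakpoint was hit.
                if (event.getKind() == DebugEvent.SUSPEND
                        && (event.getDetail() == DebugEvent.STEP_END
                            || event.getDetail() == DebugEvent.BREAKPOINT)) {
                    // ... inspect the suspended thread's stack and send a RESTful
                    // message to the Swarm Debugging Services (omitted).
                }
            }
        }

        @Override
        public void breakpointAdded(IBreakpoint breakpoint) {
            // ... persist the breakpoint's type and location via the SDS (omitted).
        }

        @Override
        public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

        @Override
        public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
    }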

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with it (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer toggles some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and its related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data through a set of specialised DomainServices, which send RESTful messages to a SwarmRestFacade connected to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] that represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (a non-invoked method) and the last node is an ending method (a non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow represents the return for this node, as illustrated by the method Circle.draw() in Figure 20. In addition, developers can go directly to a method in the Eclipse editor by double-clicking a node in the diagram.

Dynamic Method Call Graphs These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on this graph. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.

48Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate the sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled; thus, developers can share their breakpoints. The breakpoint search tool supports fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse editor by double-clicking on a selected breakpoint.

Fig 19 Breakpoint search tool (fuzzy search example)

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
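Under the hood, such a fuzzy search can be expressed against ElasticSearch's standard _search endpoint. The sketch below, again with Java's built-in HTTP client, assumes a breakpoints index with a typeName field; the index and field names are ours, not documented in the paper:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class FuzzyBreakpointSearchExample {
        public static void main(String[] args) throws Exception {
            // Fuzzy query: matches "factory" even though the term is misspelled.
            String query = "{ \"query\": { \"fuzzy\": { \"typeName\": { \"value\": \"fcatory\" } } } }";
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(query))
                    .build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body()); // JSON hits with matching breakpoints
        }
    }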


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph $G = (V, E)$, where $V = \{V_1, V_2, \ldots, V_n\}$ is a set of vertexes and $E = \{(V_1, V_2), (V_1, V_3), \ldots\}$ is a set of edges, each edge is a pair $\langle V_i, V_j \rangle$ where $V_i$ is the invoking method and $V_j$ is the invoked method. If $\alpha$ is the subset of all vertexes that invoke methods and $\beta$ is the subset of all vertexes that are invoked by methods, then the starting and ending methods are:

$StartingPoint = \{V_{SP} \mid V_{SP} \in \alpha \wedge V_{SP} \notin \beta\}$

$EndingPoint = \{V_{EP} \mid V_{EP} \in \beta \wedge V_{EP} \notin \alpha\}$

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
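These set definitions translate directly into code. The following self-contained sketch (with hypothetical method names; the SDI's actual data access is omitted) derives both sets from a list of collected invocation pairs:

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class StartEndMethodsExample {

        // An invocation edge: 'caller' invokes 'callee'.
        record Invocation(String caller, String callee) {}

        public static void main(String[] args) {
            // Hypothetical invocations, as the SDT would collect in a session.
            List<Invocation> edges = List.of(
                    new Invocation("Main.main", "Shape.draw"),
                    new Invocation("Shape.draw", "Circle.draw"),
                    new Invocation("Circle.draw", "Renderer.paint"));

            Set<String> alpha = new HashSet<>(); // vertexes that invoke (callers)
            Set<String> beta = new HashSet<>();  // vertexes that are invoked (callees)
            for (Invocation e : edges) {
                alpha.add(e.caller());
                beta.add(e.callee());
            }

            Set<String> starting = new HashSet<>(alpha); // in alpha but not in beta
            starting.removeAll(beta);
            Set<String> ending = new HashSet<>(beta);    // in beta but not in alpha
            ending.removeAll(alpha);

            System.out.println("Starting points: " + starting); // [Main.main]
            System.out.println("Ending points: " + ending);     // [Renderer.paint]
        }
    }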

Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration into sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 34: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

34Please give a shorter version with authorrunning and titlerunning prior to maketitle

We have answered what debugging information is useful to share among de-velopers to ease debugging with evidence that sharing debugging breakpointsand sessions can ease developersrsquo debugging activities Our study provides use-ful insights to researchers and tool developers on how to provide appropriatesupport during debugging activities in general they could support develop-ers by sharing other developersrsquo breakpoints and sessions They could alsodevelop recommender systems to help developers in deciding where to setbreakpointsand use this evidence to build a grounded theory on the setting ofbreakpoints and stepping by developers to improve debuggers and other toolsupport

8 Threats to Validity

Despite its promising results there exist threats to the validity of our studythat we discuss in this section

As any other empirical study ours is subject to limitations that threatenthe validity of its results The first limitation is related to the number ofparticipants we had With 7 participants we can not claim generalization ofthe results However we accept this limitation because the goal of the studywas to show the effectiveness of the data collected by the SDI to obtain insightsabout developersrsquo debugging activities Future studies with a more significantnumber of participants and more systems and tasks are needed to confirm theresults of the present research

Other threats to the validity of our study concern their internal externaland conclusion validity We accept these threats because the experimentalstudy aimed to show the effectiveness of the SDI to collect and share dataabout developersrsquo interactive debugging activities Future work is needed toperform in-depth experimental studies with these research questions and oth-ers possibly drawn from the ones that developers asked in another study bySillito et al [35]

Construct Validity Threats are related to the metrics used to answerour research questions We mainly used breakpoint locations which is a precisemeasure Moreover as we located breakpoints using our Swarm DebuggingInfrastructure (SDI) and visualisation any issue with this measure would affectour results To mitigate these threats we collected both SDI data and videocaptures of the participantsrsquo screens and compared the information extractedfrom the videos with the data collected by the SDI We observed that thebreakpoints collected by the SDI are exactly those toggled by the participants

We ask participants to self-report on their efforts during the tasks levels ofexperience etc through questionnaires Consequently it is possible that theanswer does not represent their real efforts levels etc We accept this threatbecause questionnaires are the best means to collect data about participantswithout incurring a high cost Construct validity could be improved in futurework by using instruments to measure effort independently for example butthis would lead to more time- and effort-consuming experiments

Swarm Debugging the Collective Intelligence on Interactive Debugging 35

Conclusion Validity Threats concern the relations found between inde-pendent and dependent variables In particular they concern the assumptionsof the statistical tests performed on the data and how diverse is the data Wedid not perform any statistical analysis to answer our research questions soour results do not depend on any statistical assumption

Internal Validity Threats are related to the tools used to collect thedata and the subject systems and if the collected data is sufficient to answerthe research questions We collected data using our visualisation We are wellaware that our visualisation does not scale for large systems but for JabRef itallowed participants to share paths during debugging and researchers to collectrelevant data including shared paths We plan to revise our visualisation inthe near future to identify possibilities to improve it so that it scales up tolarge systems

Each participant performed more than one task on the same system It ispossible that a participant may have become familiar with the system after ex-ecuting a task and would be knowledgeable enough to toggle breakpoints whenperforming the subsequent ones However we did not observe any significantdifference in performance when comparing the results for the same participantfor the first and last task Therefore we accept this threat but still plan for fu-ture studies with more tasks on more systems The participants probably wereaware of the fact that all faults were already solved in Github We controlledthis issue using the video recordings observing that all participants did notlook at the commit history during the experiment

External Validity Threats are about the possibility to generalise ourresults We use only one system in our Study 1 (JabRef) because we neededto have enough data points from a single system to assess the effectivenessof breakpoint prediction We should collect more data on other systems andcheck whether the system used can affect our results

9 Related work

We now summarise works related to debugging to allow better positioning ofour study among the published research

Program Understanding Previous work studied program comprehension andprovided tools to support program comprehension Maalej et al [36] observedand surveyed developers during program comprehension activities They con-cluded that developers need runtime information and reported that developersfrequently execute programs using a debugger Ko et al [37] observed that de-velopers spend large amounts of times navigating between program elements

Feature and fault location approaches are used to identify and recommendprogram elements that are relevant to a task at hand [38] These approachesuse defect report [39] domain knowledge [40] version history and defect reportsimilarity [38] while others like Mylyn [41] use developersrsquo interaction traces

36Please give a shorter version with authorrunning and titlerunning prior to maketitle

which have been used to study work interruption [42] editing patterns [4344] program exploration patterns [45] or copypaste behaviour [46]

Despite sharing similarities (tracing developer events in an IDE) our ap-proach differs from Mylynrsquos [41] First Mylynrsquos approach does not collect oruse any dynamic debugging information it is not designed to explore the dy-namic behaviours of developers during debugging sessions Second it is usefulin editing mode because it just filters files in an Eclipse view following a previ-ous context Our approach is for editing mode (finding breakpoints or visualizepaths) as during interactive debugging sessions Consequently our work andMylynrsquos are complementary and they should be used together during devel-opment sessions

Debugging Tools for Program Understanding Romero et al [47] extended thework by Katz and Anderson [48] and identified high-level debugging strategieseg stepping and breaking execution paths and inspecting variable valuesThey reported that developers use the information available in the debuggersdifferently depending on their background and level of expertise

DebugAdvisor [49] is a recommender system to improve debugging produc-tivity by automating the search for similar issues from the past

Zayour [20] studied the difficulties faced by developers when debuggingin IDEs and reported that the features of the IDE affect the times spent bydevelopers on debugging activities

Automated debugging tools Automated debugging tools require both success-ful and failed runs and do not support programs with interactive inputs [6]Consequently they have not been widely adopted in practice Moreover auto-mated debugging approaches are often unable to indicate the ldquotruerdquo locationsof faults [7] Other more interactive methods such as slicing and query lan-guages help developers but to date there has been no evidence that theysignificantly ease developersrsquo debugging activities

Recent studies showed that empirical evidence of the usefulness of manyautomated debugging techniques is limited [50] Researchers also found thatautomated debugging tools are rarely used in practice [50] At least in somescenarios the time to collect coverage information manually label the testcases as failing or passing and run the calculations may exceed the actualtime saved by using the automated debugging tools

Advanced Debugging Approaches Zheng et al [51] presented a systematic ap-proach to the statistical debugging of programs in the presence of multiplefaults using probability inference and common voting framework to accom-modate more general faults and predicate settings Ko and Myers [652] intro-duced interrogative debugging a process with which developers ask questionsabout their programs outputs to determine what parts of the programs tounderstand

Swarm Debugging the Collective Intelligence on Interactive Debugging 37

Pothier and Tanter [29] proposed Omniscient debuggers an approach tosupport back-in-time navigation across previous program states Delta debug-ging [53] by Hofer et al means that the smaller the failure-inducing inputthe less program code is covered It can be used to minimise a failure-inducinginput systematically Ressia [54] proposed object-centric debugging focusingon objects as the key abstraction execution for many tasks

Estler et al [55] discussed collaborative debugging suggesting that collab-oration in debugging activities is perceived as important by developers andcan improve their experience Our approach is consistent with this findingalthough we use asynchronous debugging sessions

Empirical Studies on Debugging Jiang et al [33] studied the change impactanalysis process that should be done during software maintenance by devel-opers to make sure changes do not introduce new faults They conducted twostudies about change impact analysis during debugging sessions They foundthat the programmers in their studies did static change impact analysis beforethey made changes by using IDE navigational functionalities They also diddynamic change impact analysis after they made changes by running the pro-grams In their study programmers did not use any change impact analysistools

Zhang et al [14] proposed a method to generate breakpoints based onexisting fault localization techniques showing that the generated breakpointscan usually save some human effort for debugging

10 Conclusion

Debugging is an important and challenging task in software maintenance re-quiring dedication and expertise However despite its importance developersrsquodebugging behaviors have not been extensively and comprehensively studiedIn this paper we introduced the concept of Swarm Debugging based on thefact that developers performing different debugging sessions build collectiveknowledge We asked what debugging information is useful to share amongdevelopers to ease debugging We particularly studied two pieces of debugginginformation breakpoints (and their locations) and sessions (debugging paths)because these pieces of information are related to the two main activities dur-ing debugging setting breakpoints and stepping inoverout statements

To evaluate the usefulness of Swarm Debugging and the sharing of de-bugging data we conducted two observational studies In the first study tounderstand how developers set breakpoints we collected and analyzed morethan 10 hours of developersrsquo videos in 45 debugging sessions performed by 28different independent developers containing 307 breakpoints on three soft-ware systems

The first study allowed us to draw four main conclusions At first settingthe first breakpoint is not an easy task and developers need tools to locatethe places where to toggle breakpoints Secondly the time of setting the first

38Please give a shorter version with authorrunning and titlerunning prior to maketitle

breakpoint is a predictor for the duration of a debugging task independentlyof the task Third developers choose breakpoints purposefully with an under-lying rationale because different developers set breakpoints on the same lineof code for the same task and also different developers toggle breakpointson the same classes or methods for different tasks showing the existence ofimportant ldquodebugging hot-spotsrdquo (ie regions in the code where there is moreincidence of debugging events) andndashor more error-prone classes and methodsFinally and surprisingly different independent developers set breakpoints atthe same locations for similar debugging tasks and thus collecting and sharingbreakpoints could assist developers during debugging task

Further we conducted a qualitative study with 23 professional developersand a controlled experiment with 13 professional developers collecting morethan 3 hours of developersrsquo debugging sessions From this second study weconcluded that (1) combining stepping paths in a graph visualisation from sev-eral debugging sessions produced elements to support developersrsquo hypothesesabout fault locations without looking at the code previously and (2) sharingprevious debugging sessions support debugging hypothesis and consequentlyreducing the effort on searching of code

Our results provide evidence that previous debugging sessions provide in-sights to and can be starting points for developers when building debugginghypotheses They showed that developers construct correct hypotheses on faultlocation when looking at graphs built from previous debugging sessions More-over they showed that developers can use past debugging sessions to identifystarting points for new debugging sessions Furthermore faults are recurrentand may be reopened sometime months later Sharing debugging sessions (asMylyn for editing sessions) is an approach to support debugging hypothesesand to support the reconstruction of the complex mental model processes in-volved in debugging However research work is in progress to corroborate theseresults

In future work we plan to build grounded theories on the use of breakpointsby developers We will use these theories to recommend breakpoints to otherdevelopers Developers need tools to locate adequate places to set breakpointsin their source code Our results suggest the opportunity for a breakpointrecommendation system similar to previous work [14] They could also formthe basis for building a grounded theory of the developersrsquo use of breakpointsto improve debuggers and other tool support

Moreover we also suggest that debugging tasks could be divided intotwo activities one of locating bugs which could benefit from the col-lective intelligence of other developers and could be performed by dedicatedldquohuntersrdquo and another one of fixing the faults which requires deep un-derstanding of the program its design its architecture and the consequencesof changes This latter activity could be performed by dedicated ldquobuildersrdquoHence actionable results include recommender systems and a change of paradigmin the debugging of software programs

Last but not least the research community can leverage the SDI to con-duct more studies to improve our understanding of developersrsquo debugging be-

Swarm Debugging the Collective Intelligence on Interactive Debugging 39

haviour which could ultimately result into the development of whole newfamilies of debugging tools that are more efficient andndashor more adapted tothe particularity of debugging Many open questions remain and this paper isjust a first step towards fully understanding how collective intelligence couldimprove debugging activities

Our vision is that IDEs should incorporate a general framework to captureand exploit IDE interactions creating an ecosystem of context-aware appli-cations and plugins Swarm Debugging is the first step towards intelligentdebuggers and IDEs context-aware programs that monitor and reason abouthow developers interact with them providing for crowd software-engineering

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engi-neering Research Council of Canada (NSERC) the Brazilian research fundingagencies CNPq (National Council for Scientific and Technological Develop-ment) and CAPES Foundation (Finance Code 001) We also acknowledgeall the participants in our experiments and the insightful comments from theanonymous reviewers

References

1 AS Tanenbaum WH Benson Software Practice and Experience 3(2) 109 (1973)DOI 101002spe4380030204

2 H Katso in Unix Programmerrsquos Manual (Bell Telephone Laboratories Inc 1979) pNA

3 MA Linton in Proceedings of the Summer USENIX Conference (1990) pp 211ndash2204 R Stallman S Shebs Debugging with GDB - The GNU Source-Level Debugger (GNU

Press 2002)5 P Wainwright GNU DDD - Data Display Debugger (2010)6 A Ko Proceeding of the 28th international conference on Software engineering - ICSE

rsquo06 p 989 (2006) DOI 101145113428511344717 J Roszligler in 2012 1st International Workshop on User Evaluation for Software Engi-

neering Researchers USER 2012 - Proceedings (2012) pp 13ndash16 DOI 101109USER20126226573

8 TD LaToza Ba Myers 2010 ACMIEEE 32nd International Conference on SoftwareEngineering 1 185 (2010) DOI 10114518067991806829

9 AJ Ko HH Aung BA Myers in Proceedings 27th International Conference onSoftware Engineering 2005 ICSE 2005 (2005) pp 126ndash135 DOI 101109ICSE20051553555

10 TD LaToza G Venolia R DeLine in ICSE (2006) pp 492ndash50111 W Maalej R Tiarks T Roehm R Koschke ACM Transactions on Software Engi-

neering and Methodology 23(4) 1 (2014) DOI 101145262266912 MA Storey L Singer B Cleary F Figueira Filho A Zagalsky in Proceedings of the

on Future of Software Engineering - FOSE 2014 (ACM Press New York New YorkUSA 2014) pp 100ndash116 DOI 10114525938822593887

13 M Beller N Spruit A Zaidman How developers debug (2017) URL httpsdoi

org107287peerjpreprints2743v1

14 C Zhang J Yang D Yan S Yang Y Chen Journal of Software 86(3) 603 (2013)DOI 104304jsw83603-616

40Please give a shorter version with authorrunning and titlerunning prior to maketitle

15 R Tiarks T Rohm Softwaretechnik-Trends 32(2) 19 (2013) DOI 101007BF03323460 URL httplinkspringercom101007BF03323460

16 F Petrillo G Lacerda M Pimenta C Freitas in 2015 IEEE 3rd Working Con-ference on Software Visualization (VISSOFT) (IEEE 2015) pp 140ndash144 DOI101109VISSOFT20157332425

17 F Petrillo Z Soh F Khomh M Pimenta C Freitas YG Gueheneuc in In Proceed-ings of the 2016 IEEE International Conference on Software Quality Reliability andSecurity (QRS) (2016) p 10

18 F Petrillo H Mandian A Yamashita F Khomh YG Gueheneuc in 2017 IEEEInternational Conference on Software Quality Reliability and Security (QRS) (2017)pp 285ndash295 DOI 101109QRS201739

19 K Araki Z Furukawa J Cheng Software IEEE 8(3) 14 (1991) DOI 101109528893920 I Zayour A Hamdar Information and Software Technology 70 130 (2016) DOI

101016jinfsof20151001021 R Chern K De Volder in Proceedings of the 6th International Conference on Aspect-

oriented Software Development (ACM New York NY USA 2007) AOSD rsquo07 pp96ndash106 DOI 10114512185631218575 URL httpdoiacmorg1011451218563

1218575

22 Eclipse Managing conditional breakpoints URL httphelpeclipseorg

neonindexjsptopic=2Forgeclipsejdtdocuser2Ftasks2Ftask-manage_

conditional_breakpointhtmampcp=1_3_6_0_5

23 S Garnier J Gautrais G Theraulaz Swarm Intelligence 1(1) 3 (2007) DOI 101007s11721-007-0004-y

24 WR Tschinkel Journal of Bioeconomics 17(3) 271 (2015) DOI 101007s10818-015-9203-6 URL httpdxdoiorg101007s10818-015-9203-6http

linkspringercom101007s10818-015-9203-6

25 T Ball S Eick Computer 29(4) 33 (1996) DOI 101109248829926 A Cockburn Agile Software Development The Cooperative Game Second Edition

(Addison-Wesley Professional 2006)27 J Lawrance C Bogart M Burnett R Bellamy K Rector SD Fleming IEEE Trans-

actions on Software Engineering 39(2) 197 (2013) DOI 101109TSE201011128 D Piorkowski SD Fleming C Scaffidi M Burnett I Kwan AZ Henley J Macbeth

C Hill A Horvath in 2015 IEEE International Conference on Software Maintenanceand Evolution (ICSME) (2015) pp 11ndash20 DOI 101109ICSM20157332447

29 G Pothier E Tanter IEEE Software 26(6) 78 (2009) DOI 101109MS2009169URL httpieeexploreieeeorglpdocsepic03wrapperhtmarnumber=5287015

30 M Beller N Spruit D Spinellis A Zaidman in 40th International Conference on Soware Engineering ICSE (2018) pp 572ndash583

31 D Grove G DeFouw J Dean C Chambers Proceedings of the 12th ACM SIG-PLAN conference on Object-oriented programming systems languages and appli-cations - OOPSLA rsquo97 pp 108ndash124 (1997) DOI 101145263698264352 URLhttpportalacmorgcitationcfmdoid=263698264352

32 R Saito ME Smoot K Ono J Ruscheinski Pl Wang S Lotia AR Pico GDBader T Ideker Nature methods 9(11) 1069 (2012) DOI 101038nmeth2212URL httpwwwpubmedcentralnihgovarticlerenderfcgiartid=3649846amptool=

pmcentrezamprendertype=abstract

33 S Jiang C McMillan R Santelices Empirical Software Engineering pp 1ndash39 (2016)DOI 101007s10664-016-9441-9

34 R Pienta J Abello M Kahng DH Chau in 2015 International Conference on BigData and Smart Computing (BIGCOMP) (IEEE 2015) pp 271ndash278 DOI 10110935021BIGCOMP20157072812

35 J Sillito GC Murphy KD Volder IEEE Transactions on Software Engineering 34(4)434 (2008)

36 W Maalej R Tiarks T Roehm R Koschke ACM Transactions on Software En-gineering and Methodology 23(4) 311 (2014) DOI 1011452622669 URL http

doiacmorg1011452622669

37 AJ Ko BA Myers MJ Coblenz HH Aung IEEE Transaction on Software Engi-neering 32(12) 971 (2006)

Swarm Debugging the Collective Intelligence on Interactive Debugging 41

38 S Wang D Lo in Proceedings of the 22nd International Conference on Program Com-prehension - ICPC 2014 (ACM Press New York New York USA 2014) pp 53ndash63DOI 10114525970082597148

39 J Zhou H Zhang D Lo in 2012 34th International Conference on Software Engi-neering (ICSE) (IEEE 2012) pp 14ndash24 DOI 101109ICSE20126227210

40 X Ye R Bunescu C Liu in Proceedings of the 22nd ACM SIGSOFT InternationalSymposium on Foundations of Software Engineering - FSE 2014 (ACM Press NewYork New York USA 2014) pp 689ndash699 DOI 10114526358682635874

41 M Kersten GC Murphy in Proceedings of the 14th ACM SIGSOFT internationalsymposium on Foundations of software engineering (2006) pp 1ndash11

42 H Sanchez R Robbes VM Gonzalez in Software Analysis Evolution and Reengi-neering (SANER) 2015 IEEE 22nd International Conference on (2015) pp 251ndash260

43 A Ying M Robillard in Proceedings International Conference on Program Compre-hension (2011) pp 31ndash40

44 F Zhang F Khomh Y Zou AE Hassan in Proceedings Working Conference onReverse Engineering (2012) pp 456ndash465

45 Z Soh F Khomh YG Gueheneuc G Antoniol B Adams in Reverse Engineering(WCRE) 2013 20th Working Conference on (2013) pp 391ndash400 DOI 101109WCRE20136671314

46 TM Ahmed W Shang AE Hassan in Mining Software Repositories (MSR) 2015IEEEACM 12th Working Conference on (2015) pp 99ndash110 DOI 101109MSR201517

47 P Romero B du Boulay R Cox R Lutz S Bryant International Journal of Human-Computer Studies 65(12) 992 (2007) DOI 101016jijhcs200707005

48 I Katz J Anderson Human-Computer Interaction 3(4) 351 (1987) DOI 101207s15327051hci0304 2

49 B Ashok J Joy H Liang SK Rajamani G Srinivasa V Vangala in Proceedings ofthe 7th joint meeting of the European software engineering conference and the ACMSIGSOFT symposium on The foundations of software engineering on European soft-ware engineering conference and foundations of software engineering symposium - E(ACM Press New York New York USA 2009) p 373 DOI 10114515956961595766

50 C Parnin A Orso Proceedings of the 2011 International Symposium on SoftwareTesting and Analysis ISSTA 11 p 199 (2011) DOI 10114520014202001445

51 AX Zheng MI Jordan B Liblit M Naik A Aiken Challenges 148 1105 (2006)DOI 10114511438441143983

52 A Ko BA Myers in CHI 2009 - Proceedings of the SIGCHI Conference on HumanFactors in Computing Systems ed by ACM (New York New York USA 2009) pp1569ndash1578 DOI 10114515187011518942

53 B Hofer F Wotawa Advances in Software Engineering 2012 13 (2012) DOI 1011552012628571

54 J Ressia A Bergel O Nierstrasz in Proceedings - International Conference on Soft-ware Engineering (2012) pp 485ndash495 DOI 101109ICSE20126227167

55 HC Estler M Nordio Ca Furia B Meyer Proceedings - IEEE 8th InternationalConference on Global Software Engineering ICGSE 2013 pp 110ndash119 (2013) DOI101109ICGSE201321

56 S Demeyer S Ducasse M Lanza in Proceedings of the Sixth Working Conference onReverse Engineering (IEEE Computer Society Washington DC USA 1999) WCRErsquo99 pp 175ndash URL httpdlacmorgcitationcfmid=832306837044

57 D Piorkowski S Fleming in Proceedings of the SIGCHI Conference on Human Factorsin Computing Systems CHI rsquo13 (ACM Paris France 2013) pp 3063mdash-3072 DOI10114524664162466418 URL httpdlacmorgcitationcfmid=2466418

58 SD Fleming C Scaffidi D Piorkowski M Burnett R Bellamy J Lawrance I KwanACM Transactions on Software Engineering and Methodology 22(2) 1 (2013) DOI10114524305452430551

42Please give a shorter version with authorrunning and titlerunning prior to maketitle

Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed bythe Swarm Debugging Tracer (SDT) to store and later share debugging datafrom and between developers Figure 13 shows the architecture of this in-frastructure The SDT sends RESTful messages that are received by a SDSinstance that stores them in three specialized persistence mechanisms an SQLdatabase (PostgreSQL) a full-text search engine (ElasticSearch) and a graphdatabase (Neo4J)

Fig. 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.
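For concreteness, here is a minimal sketch of what one such message could look like when the SDT reports a newly toggled breakpoint. The endpoint path and the field names are illustrative assumptions, not the actual SDT wire format:

    POST http://swarmdebugging.org/breakpoints
    {
      "session": 42,
      "developer": "petrillo",
      "type": "org.example.CircleFactory",
      "method": "createCircle",
      "lineNumber": 31,
      "conditional": false
    }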

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of one or more Eclipse projects.
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.
– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types whose source code is available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the data collected when a developer performs some action during a debugging session.

Fig. 14 The Swarm Debugging metadata [17]
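To make the meta-model concrete, the sketch below shows how some of these concepts could be declared as JPA entities. The class and field names follow the concepts above, but the annotations and structure are illustrative assumptions rather than the actual SDS implementation:

    import java.util.List;
    import javax.persistence.*;

    @Entity
    class Developer { @Id @GeneratedValue Long id; String name; }

    @Entity
    class Task { @Id @GeneratedValue Long id; String title; }

    @Entity
    class Event { @Id @GeneratedValue Long id; String kind; @ManyToOne Session session; }

    // A Session ties a developer and a task to the debugging events
    // collected while the session is active.
    @Entity
    class Session {
        @Id @GeneratedValue Long id;
        @ManyToOne Developer developer;
        @ManyToOne Task task;
        @OneToMany(mappedBy = "session") List<Event> events;
    }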

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API The SDS provides a RESTful API to manipulate debugging data, using the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
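Such a query can be issued from any HTTP client. The sketch below shows the call and a plausible response; the HAL-style "_embedded" envelope is what Spring Data REST produces by default, but the exact fields of a developer resource are illustrative:

    curl "http://swarmdebugging.org/developers/search/findByName?name=petrillo"

    {
      "_embedded": {
        "developers": [
          { "name": "petrillo" }
        ]
      }
    }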

SQL Query Console The SDS provides a console30 that receives SQL queries over the debugging data, providing relational aggregations and functions.
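As an illustration, a query such as the following could rank the types in which developers toggle the most breakpoints; the table and column names are assumptions about the SDS schema, not its documented layout:

    SELECT t.name AS type, COUNT(b.id) AS breakpoints
    FROM breakpoint b
    JOIN type t ON b.type_id = t.id
    GROUP BY t.name
    ORDER BY breakpoints DESC;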

Full-text Search Engine The SDS also provides ElasticSearch31, a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS instantiates an ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig. 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig. 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
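For instance, a Cypher query along the following lines would retrieve the invocations recorded for one debugging session; the Method label, the INVOKES relationship, and the session property are assumptions about the SDS graph schema:

    MATCH (caller:Method)-[inv:INVOKES]->(callee:Method)
    WHERE inv.session = 42
    RETURN caller.name, callee.name;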

Swarm Debugging Tracer

The Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
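The skeleton below sketches how such a tracer registers itself with the Eclipse Debug Core framework. The registration calls are the standard Eclipse API; the class body and comments are an illustrative reconstruction of what the SDT does, not its actual source:

    import org.eclipse.core.resources.IMarkerDelta;
    import org.eclipse.debug.core.DebugEvent;
    import org.eclipse.debug.core.DebugPlugin;
    import org.eclipse.debug.core.IBreakpointListener;
    import org.eclipse.debug.core.IDebugEventSetListener;
    import org.eclipse.debug.core.model.IBreakpoint;

    public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

        // Register this tracer with the Eclipse debug framework.
        public void start() {
            DebugPlugin plugin = DebugPlugin.getDefault();
            plugin.addDebugEventListener(this);
            plugin.getBreakpointManager().addBreakpointListener(this);
        }

        @Override
        public void handleDebugEvents(DebugEvent[] events) {
            for (DebugEvent event : events) {
                // A SUSPEND event with detail STEP_END signals that a Step Into,
                // Step Over, or Step Return has completed: this is where the
                // stack trace would be analyzed to extract method invocations.
                if (event.getKind() == DebugEvent.SUSPEND
                        && event.getDetail() == DebugEvent.STEP_END) {
                    // analyze stack frames and send invocations to the SDS (omitted)
                }
            }
        }

        @Override
        public void breakpointAdded(IBreakpoint breakpoint) {
            // send the breakpoint's type and location to the SDS (omitted)
        }

        @Override
        public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

        @Override
        public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
    }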

After an authentication step, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and the Tracer analyzes stack trace items, extracting method invocations.

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18); this association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.

32 http://neo4j.com


Fig. 17 The Swarm Tracer architecture [17]

Fig. 18 The Swarm Manager view

Typically, the developer toggles some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and its related data. It also stores data about the called methods, storing an invocation entry for each invoking/invoked method pair. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, which connects to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following.

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (a non-invoked method) and the last node is an ending method (a non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs These are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on this graph. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


Fig. 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.

Fig. 19 Breakpoint search tool (fuzzy search example)
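A fuzzy ElasticSearch query of the following shape would still match breakpoints toggled in types whose names contain "factory"; the breakpoints index and the typeName field are assumptions about how the SDS stores its documents:

    GET /breakpoints/_search
    {
      "query": {
        "fuzzy": { "typeName": "fcatory" }
      }
    }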


Fig. 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is a pair ⟨Vi, Vj⟩, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes that invoke methods and β is the subset of all vertexes that are invoked, then the starting and ending methods are:

StartingPoint = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoint = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
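These definitions translate directly into set operations. The short, self-contained sketch below computes both sets from a list of invoking/invoked pairs; the pairs shown are illustrative:

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public final class GraphPoints {
        public static void main(String[] args) {
            // Each element is a pair {invoking method, invoked method}.
            List<String[]> invocations = List.of(
                new String[]{"Client.main", "Circle.draw"},
                new String[]{"Circle.draw", "DrawingAPI.drawCircle"});

            Set<String> alpha = new HashSet<>(); // vertexes that invoke methods
            Set<String> beta = new HashSet<>();  // vertexes that are invoked
            for (String[] edge : invocations) {
                alpha.add(edge[0]);
                beta.add(edge[1]);
            }

            Set<String> starting = new HashSet<>(alpha);
            starting.removeAll(beta); // in alpha but not in beta
            Set<String> ending = new HashSet<>(beta);
            ending.removeAll(alpha);  // in beta but not in alpha

            System.out.println("Starting points: " + starting); // [Client.main]
            System.out.println("Ending points: " + ending);     // [DrawingAPI.drawCircle]
        }
    }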

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time, interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration into sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using the Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.


42Please give a shorter version with authorrunning and titlerunning prior to maketitle

Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed bythe Swarm Debugging Tracer (SDT) to store and later share debugging datafrom and between developers Figure 13 shows the architecture of this in-frastructure The SDT sends RESTful messages that are received by a SDSinstance that stores them in three specialized persistence mechanisms an SQLdatabase (PostgreSQL) a full-text search engine (ElasticSearch) and a graphdatabase (Neo4J)

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define thesemantics of the SDT messages

We choose and define domain concepts to model software projects anddebugging data Figure 14 shows the meta-model of these concepts using anentity-relationship representation The concepts are inspired by the FAMIXData model [56] The concepts include

ndash Developer is a SDT user She creates and executes debugging sessionsndash Product is the target software product A product is a set of Eclipse

projects (1 or more)ndash Task is the task to be executed by developersndash Session represents a Swarm Debugging session It relates developer project

and debugging events

Swarm Debugging the Collective Intelligence on Interactive Debugging 43

Fig 14 The Swarm Debugging metadata [17]

ndash Type represents classes and interfaces in the project Each type has asource code and a file SDS only considers types that have source codeavailable as belonging to the project domain

ndash Method is a method associated with a type which can be invoked duringdebugging sessions

ndash Namespace is a container for types In Java namespaces are declaredwith the keyword package

ndash Invocation is a method invoked from another method (or from the JVMin case of the main method)

ndash Breakpoint represents the data collected when a developer toggles abreakpoint in the Eclipse IDE Each breakpoint is associated with a typeand a method if appropriate

ndash Event is an event data that is collected when a developer performs someactions during a debugging session

The SDS provides several services for manipulating querying and search-ing collected data (1) Swarm RESTful API (2) SQL query console (3) full-text search API (4) dashboard service and (5) graph querying console

Swarm RESTful API The SDS provides a RESTful API to manipulate de-bugging data using the Spring Boot framework29 Create retrieve update

29 httpprojectsspringiospring-boot

44Please give a shorter version with authorrunning and titlerunning prior to maketitle

and delete operations are available through HTTP requests and respond witha JSON structure For example upon submitting the HTTP request

httpswarmdebuggingorgdevelopers

searchfindByNamename=petrillo

the SDS responds with a list of developers whose names are ldquopetrillordquo inJSON format

SQL Query Console The SDS provides a console30 to receive SQL queries(SQL) on the debugging data providing relational aggregations and functions

Full-text Search Engine The SDS also provides an ElasticSearch31 which isa highly scalable open-source full-text search and analytic engine to storesearch and analyse the debugging data The SDS instantiates an instance ofthe ElasticSearch engine and offers a console for executing complex queries onthe debugging data

Dashboard Service The ElasticSearch allows the use of the Kibana dashboardThe SDS exposes a Kibana instance on the debugging data With the dash-board researchers can build charts describing the data Figure 15 shows aSwarm Dashboard embedded into Eclipse as a view

Fig 15 Swarm Debugging Dashboard

30 httpdbswarmdebuggingorg31 httpswwwelasticco

Swarm Debugging the Collective Intelligence on Interactive Debugging 45

Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32

graph database Neo4J provides a query language named Cypher which is adeclarative SQL-inspired language for describing patterns in graphs It allowsresearchers to express what they want to select insert update or delete froma graph database without describing precisely how to do it The SDS exposesthe Neo4J Browser and creates an Eclipse view

Figure 16 shows an example of Cypher query and the resulting graph

Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debug-ger events during debugging sessions extending the Java Platform DebuggingArchitecture (JDPA) Using the Eclipse JPDA events are listened by our De-bugTracer that implements two listenersIDebugEventSetListener and IBreakpointListener Figure 17 shows theSDT architecture

After an authentication process developers create a debugging session us-ing the Swarm Manager view and toggle breakpoints trigger stepping eventsas Step Into Step Over or Step Return These events are caught and stacktrace items are analyzed by the Tracer extracting method invocations

To use the SDT a developer must open the view ldquoSwarm Managerrdquo and es-tablish a connection with the Swarm Debugging Services If the target projectis not into the Swarm Manager she can associate any project in her work-space into Swarm Manager (as shown in Figure 18) This association consistsof linking a Swarm Session with a project in the Eclipse workspace Secondshe must create a Swarm session Once a session is established she can useany feature of the regular Eclipse debugger the SDT collects developersrsquo in-teraction events in the background with no visible performance decrease

32 httpneo4jcom

46Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically the developer will toggle some breakpoints to stop the executionof the program of interest at locations deemed relevant to fix the fault at handThe SDT collects the data associated to these breakpoints (locations condi-tions and so on) After toggling breakpoints the developer runs the programin debug mode The program stops at the first reached breakpoint Conse-quently for each event such as Step Into or Breakpoint the SDT capturesthe event and related data It also stores data about methods called storing

Swarm Debugging the Collective Intelligence on Interactive Debugging 47

Fig 19 Breakpoint search tool (fuzzy search example)

invocations entry for each pair invokinginvoked method Following the forag-ing approach [57] the SDT only collects invokinginvoked methods that werevisited by the developer during the debugging session ignoring other invoca-tions The debugging activity continues until the program run finishes TheSwarm session is then completed

To avoid performance and memory issues the SDT collects and sends thedata using a set of specialised DomainServices that send RESTful messagesto a SwarmRestFacade connecting to the Swarm Debugging Services

Swarm Debugging Views

On top of the SDS the SDI implements and proposes several tools to searchand visualise the data collected during debugging sessions These tools areintegrated in the Eclipse IDE simplifying their usage They include but arenot limited to the followings

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] torepresent sequences of method invocations as shown by Figure 20 They usecircles to represent methods and arrows to represent invocations Each line isa complete stack trace without returns The first node is a starting method(non-invoked method) and the last node is an ending method (non-invokingmethod) If an invocation chain contains a non-starting method a new line iscreated and the actual stack is repeated and a dotted arrow is used to representa return for this node as illustrated by the method Circledraw in Figure 20In addition developers can directly go to a method in the Eclipse Editor bydouble-clicking over a node in the diagram

Dynamic Method Call Graphs They are direct call graphs [31] as shown inFigure 21 to display the hierarchical relations between invoked methods Theyuse circles to represent methods and oriented arrows to express invocationsEach session generates a graph and all invocations collected during the sessionare shown on these graphs The starting points (non-invoked methods) areallocated on top of a tree and adjacent nodes represent invocations sequences

48Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invocation methods pressing the F9(forward) and F10 (backward) keys They can also directly go to a method inthe Eclipse Editor by double-clicking on nodes in the graphs

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58]when working with the debugger For each breakpoint the SDS captures thetype and location in the type where the breakpoint was toggled Thus de-velopers can share their breakpoints The breakpoint search tool allows fuzzymatch and wildcard ElasticSearch queries Results are displayed in the SearchView table for easy selection Developers can also open a type directly in theEclipse Editor by double-clicking on a selected breakpoint

Figure 19 shows an example of breakpoint search in which the search boxcontains the misspelled word fcatory

Swarm Debugging the Collective Intelligence on Interactive Debugging 49

Fig 21 Method call graph for Bridge design pattern [17]

StartingEnding Method Search Tool

This tool allows searching for methods that (1) only invoke other methods butthat are not explicitly invoked themselves during the debugging session and(2) that are only invoked by others but that do not invoke other methods

Formally we define StartingEnding methods as follows Given a graphG = (VE) where V is a set of vertexes V = V1 V2 Vn and E is a setof edges E = (V1 V2) (V 1 V 3) Then each edge is formed by a pairlt Vi Vj gt were Vi is the invoking method and Vj is the invoked methodIf α is the subset of all vertexes invoking methods and β is the subset of allvertexes invoked by methods then the Starting and Ending methods are

StartingPoint = VSP | VSP isin α and VSP isin β

EndingPoint = VEP | VEP isin β and VEP isin α

Locating these methods is important in a debugging session because theyare the entries and exits points of a program at runtime

Summary

Through the SDI we provide a technique and model to collect store and shareinteractive debugging session data contextualizing breakpoints and eventsduring these sessions We created real-time and interactive visualizations usingweb technologies providing an automatic memory for developer explorationsMoreover dividing software exploration by sessions and its call graphs areeasy to understand because only intentional visited areas are shown on these

50Please give a shorter version with authorrunning and titlerunning prior to maketitle

graphs one can go through the execution of a project and see only the impor-tant areas that are relevant to developers

Currently the Swarm Tracer is implemented in Java using Eclipse DebugCore services However SDI provides a RESTful API that can be accessedindependently and new tracers can be implemented for different IDEs or de-buggers

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 36: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

36Please give a shorter version with authorrunning and titlerunning prior to maketitle

which have been used to study work interruption [42] editing patterns [4344] program exploration patterns [45] or copypaste behaviour [46]

Despite sharing similarities (tracing developer events in an IDE) our ap-proach differs from Mylynrsquos [41] First Mylynrsquos approach does not collect oruse any dynamic debugging information it is not designed to explore the dy-namic behaviours of developers during debugging sessions Second it is usefulin editing mode because it just filters files in an Eclipse view following a previ-ous context Our approach is for editing mode (finding breakpoints or visualizepaths) as during interactive debugging sessions Consequently our work andMylynrsquos are complementary and they should be used together during devel-opment sessions

Debugging Tools for Program Understanding Romero et al [47] extended thework by Katz and Anderson [48] and identified high-level debugging strategieseg stepping and breaking execution paths and inspecting variable valuesThey reported that developers use the information available in the debuggersdifferently depending on their background and level of expertise

DebugAdvisor [49] is a recommender system to improve debugging produc-tivity by automating the search for similar issues from the past

Zayour [20] studied the difficulties faced by developers when debuggingin IDEs and reported that the features of the IDE affect the times spent bydevelopers on debugging activities

Automated debugging tools Automated debugging tools require both success-ful and failed runs and do not support programs with interactive inputs [6]Consequently they have not been widely adopted in practice Moreover auto-mated debugging approaches are often unable to indicate the ldquotruerdquo locationsof faults [7] Other more interactive methods such as slicing and query lan-guages help developers but to date there has been no evidence that theysignificantly ease developersrsquo debugging activities

Recent studies showed that empirical evidence of the usefulness of manyautomated debugging techniques is limited [50] Researchers also found thatautomated debugging tools are rarely used in practice [50] At least in somescenarios the time to collect coverage information manually label the testcases as failing or passing and run the calculations may exceed the actualtime saved by using the automated debugging tools

Advanced Debugging Approaches Zheng et al [51] presented a systematic ap-proach to the statistical debugging of programs in the presence of multiplefaults using probability inference and common voting framework to accom-modate more general faults and predicate settings Ko and Myers [652] intro-duced interrogative debugging a process with which developers ask questionsabout their programs outputs to determine what parts of the programs tounderstand

Swarm Debugging the Collective Intelligence on Interactive Debugging 37

Pothier and Tanter [29] proposed Omniscient debuggers an approach tosupport back-in-time navigation across previous program states Delta debug-ging [53] by Hofer et al means that the smaller the failure-inducing inputthe less program code is covered It can be used to minimise a failure-inducinginput systematically Ressia [54] proposed object-centric debugging focusingon objects as the key abstraction execution for many tasks

Estler et al [55] discussed collaborative debugging suggesting that collab-oration in debugging activities is perceived as important by developers andcan improve their experience Our approach is consistent with this findingalthough we use asynchronous debugging sessions

Empirical Studies on Debugging Jiang et al [33] studied the change impactanalysis process that should be done during software maintenance by devel-opers to make sure changes do not introduce new faults They conducted twostudies about change impact analysis during debugging sessions They foundthat the programmers in their studies did static change impact analysis beforethey made changes by using IDE navigational functionalities They also diddynamic change impact analysis after they made changes by running the pro-grams In their study programmers did not use any change impact analysistools

Zhang et al [14] proposed a method to generate breakpoints based onexisting fault localization techniques showing that the generated breakpointscan usually save some human effort for debugging

10 Conclusion

Debugging is an important and challenging task in software maintenance re-quiring dedication and expertise However despite its importance developersrsquodebugging behaviors have not been extensively and comprehensively studiedIn this paper we introduced the concept of Swarm Debugging based on thefact that developers performing different debugging sessions build collectiveknowledge We asked what debugging information is useful to share amongdevelopers to ease debugging We particularly studied two pieces of debugginginformation breakpoints (and their locations) and sessions (debugging paths)because these pieces of information are related to the two main activities dur-ing debugging setting breakpoints and stepping inoverout statements

To evaluate the usefulness of Swarm Debugging and the sharing of de-bugging data we conducted two observational studies In the first study tounderstand how developers set breakpoints we collected and analyzed morethan 10 hours of developersrsquo videos in 45 debugging sessions performed by 28different independent developers containing 307 breakpoints on three soft-ware systems

The first study allowed us to draw four main conclusions At first settingthe first breakpoint is not an easy task and developers need tools to locatethe places where to toggle breakpoints Secondly the time of setting the first

38Please give a shorter version with authorrunning and titlerunning prior to maketitle

breakpoint is a predictor for the duration of a debugging task independentlyof the task Third developers choose breakpoints purposefully with an under-lying rationale because different developers set breakpoints on the same lineof code for the same task and also different developers toggle breakpointson the same classes or methods for different tasks showing the existence ofimportant ldquodebugging hot-spotsrdquo (ie regions in the code where there is moreincidence of debugging events) andndashor more error-prone classes and methodsFinally and surprisingly different independent developers set breakpoints atthe same locations for similar debugging tasks and thus collecting and sharingbreakpoints could assist developers during debugging task

Further we conducted a qualitative study with 23 professional developersand a controlled experiment with 13 professional developers collecting morethan 3 hours of developersrsquo debugging sessions From this second study weconcluded that (1) combining stepping paths in a graph visualisation from sev-eral debugging sessions produced elements to support developersrsquo hypothesesabout fault locations without looking at the code previously and (2) sharingprevious debugging sessions support debugging hypothesis and consequentlyreducing the effort on searching of code

Our results provide evidence that previous debugging sessions provide in-sights to and can be starting points for developers when building debugginghypotheses They showed that developers construct correct hypotheses on faultlocation when looking at graphs built from previous debugging sessions More-over they showed that developers can use past debugging sessions to identifystarting points for new debugging sessions Furthermore faults are recurrentand may be reopened sometime months later Sharing debugging sessions (asMylyn for editing sessions) is an approach to support debugging hypothesesand to support the reconstruction of the complex mental model processes in-volved in debugging However research work is in progress to corroborate theseresults

In future work we plan to build grounded theories on the use of breakpointsby developers We will use these theories to recommend breakpoints to otherdevelopers Developers need tools to locate adequate places to set breakpointsin their source code Our results suggest the opportunity for a breakpointrecommendation system similar to previous work [14] They could also formthe basis for building a grounded theory of the developersrsquo use of breakpointsto improve debuggers and other tool support

Moreover we also suggest that debugging tasks could be divided intotwo activities one of locating bugs which could benefit from the col-lective intelligence of other developers and could be performed by dedicatedldquohuntersrdquo and another one of fixing the faults which requires deep un-derstanding of the program its design its architecture and the consequencesof changes This latter activity could be performed by dedicated ldquobuildersrdquoHence actionable results include recommender systems and a change of paradigmin the debugging of software programs

Last but not least the research community can leverage the SDI to con-duct more studies to improve our understanding of developersrsquo debugging be-

Swarm Debugging the Collective Intelligence on Interactive Debugging 39

haviour which could ultimately result into the development of whole newfamilies of debugging tools that are more efficient andndashor more adapted tothe particularity of debugging Many open questions remain and this paper isjust a first step towards fully understanding how collective intelligence couldimprove debugging activities

Our vision is that IDEs should incorporate a general framework to captureand exploit IDE interactions creating an ecosystem of context-aware appli-cations and plugins Swarm Debugging is the first step towards intelligentdebuggers and IDEs context-aware programs that monitor and reason abouthow developers interact with them providing for crowd software-engineering

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engi-neering Research Council of Canada (NSERC) the Brazilian research fundingagencies CNPq (National Council for Scientific and Technological Develop-ment) and CAPES Foundation (Finance Code 001) We also acknowledgeall the participants in our experiments and the insightful comments from theanonymous reviewers

References

1. A.S. Tanenbaum, W.H. Benson, Software: Practice and Experience 3(2), 109 (1973). DOI 10.1002/spe.4380030204
2. H. Katso, in Unix Programmer's Manual (Bell Telephone Laboratories, Inc., 1979), p. NA
3. M.A. Linton, in Proceedings of the Summer USENIX Conference (1990), pp. 211–220
4. R. Stallman, S. Shebs, Debugging with GDB – The GNU Source-Level Debugger (GNU Press, 2002)
5. P. Wainwright, GNU DDD – Data Display Debugger (2010)
6. A. Ko, in Proceedings of the 28th International Conference on Software Engineering – ICSE '06, p. 989 (2006). DOI 10.1145/1134285.1134471
7. J. Rößler, in 2012 1st International Workshop on User Evaluation for Software Engineering Researchers, USER 2012 – Proceedings (2012), pp. 13–16. DOI 10.1109/USER.2012.6226573
8. T.D. LaToza, B.A. Myers, 2010 ACM/IEEE 32nd International Conference on Software Engineering 1, 185 (2010). DOI 10.1145/1806799.1806829
9. A.J. Ko, H.H. Aung, B.A. Myers, in Proceedings of the 27th International Conference on Software Engineering, ICSE 2005 (2005), pp. 126–135. DOI 10.1109/ICSE.2005.1553555
10. T.D. LaToza, G. Venolia, R. DeLine, in ICSE (2006), pp. 492–501
11. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 1 (2014). DOI 10.1145/2622669
12. M.A. Storey, L. Singer, B. Cleary, F. Figueira Filho, A. Zagalsky, in Proceedings of the Future of Software Engineering – FOSE 2014 (ACM Press, New York, USA, 2014), pp. 100–116. DOI 10.1145/2593882.2593887
13. M. Beller, N. Spruit, A. Zaidman, How developers debug (2017). URL https://doi.org/10.7287/peerj.preprints.2743v1
14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616
15. R. Tiarks, T. Rohm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460
16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425
17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10
18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39
19. K. Araki, Z. Furukawa, J. Cheng, IEEE Software 8(3), 14 (1991). DOI 10.1109/52.88939
20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010
21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575
22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5
23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y
24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://link.springer.com/10.1007/s10818-015-9203-6
25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299
26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)
27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111
28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447
29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015
30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583
31. D. Grove, G. DeFouw, J. Dean, C. Chambers, in Proceedings of the 12th ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications – OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352
32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract
33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9
34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812
35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)
36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 311 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669
37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)
38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension – ICPC 2014 (ACM Press, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148
39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210
40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering – FSE 2014 (ACM Press, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874
41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11
42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260
43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40
44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465
45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314
46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17
47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005
48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2
49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering – ESEC/FSE '09 (ACM Press, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766
50. C. Parnin, A. Orso, in Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445
51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983
52. A. Ko, B.A. Myers, in CHI 2009 – Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942
53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571
54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings – International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167
55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, in Proceedings – IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21
56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044
57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418
58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts, illustrated by a code sketch after the list, include:

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of one or more Eclipse projects.
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.


Fig 14 The Swarm Debugging metadata [17]

– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and a method, if appropriate.
– Event is the data collected when a developer performs some action during a debugging session.
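To make the meta-model more concrete, the sketch below renders these concepts as plain Java classes. The class and field names follow Figure 14, but the exact SDS schema (identifiers, collection types, persistence annotations) is our assumption for illustration, not the actual implementation.

import java.util.List;

// Hypothetical, simplified rendering of the Swarm Debugging meta-model.
// Field names are illustrative; the real SDS entities may differ.
public class SwarmMetaModel {
    static class Developer { long id; String name; }
    static class Product   { long id; String name; List<String> eclipseProjects; }
    static class Task      { long id; String title; }
    static class Type      { long id; String fullName; String sourceFile; }
    static class Method    { long id; String signature; Type declaringType; }
    static class Breakpoint {
        long id; Type type; Method method; int lineNumber; // location data
    }
    static class Invocation { Method invoking; Method invoked; }
    static class Event      { long id; String kind; long timestamp; } // e.g., "StepInto"
    static class Session {
        long id;
        Developer developer; Product product; Task task;
        List<Breakpoint> breakpoints; List<Invocation> invocations; List<Event> events;
    }
}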

The SDS provides several services for manipulating, querying, and searching the collected data: (1) a Swarm RESTful API, (2) an SQL query console, (3) a full-text search API, (4) a dashboard service, and (5) a graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built with the Spring Boot framework (http://projects.spring.io/spring-boot). Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with a list of developers whose names are "petrillo", in JSON format.
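The API can be exercised from any HTTP client. The following is a minimal sketch using the standard Java 11 HttpClient against the endpoint shown above; the JSON shape of the response is assumed to follow Spring Data REST conventions.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Minimal sketch of querying the SDS RESTful API from Java 11+.
public class SwarmApiClient {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(
                    "http://swarmdebugging.org/developers/search/findByName?name=petrillo"))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode()); // e.g., 200
        System.out.println(response.body());       // JSON list of developers
    }
}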

SQL Query Console. The SDS provides a console (http://db.swarmdebugging.org) to receive SQL queries on the debugging data, providing relational aggregations and functions.
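As an illustration of such an aggregation, the sketch below issues one plausible query over JDBC, counting breakpoints per type. The connection settings and the table and column names are assumptions derived from the meta-model of Figure 14, not the actual SDS schema.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Sketch of the kind of relational aggregation the console supports.
public class BreakpointsPerType {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/swarm", "swarm", "secret");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                 "SELECT t.full_name, COUNT(*) AS breakpoints " +
                 "FROM breakpoint b JOIN type t ON b.type_id = t.id " +
                 "GROUP BY t.full_name ORDER BY breakpoints DESC")) {
            while (rs.next()) {
                System.out.printf("%s: %d%n", rs.getString(1), rs.getLong(2));
            }
        }
    }
}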

Full-text Search Engine. The SDS also provides ElasticSearch (https://www.elastic.co), a highly scalable open-source full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS runs an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard



Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J (http://neo4j.com) graph database. Neo4J provides a query language named Cypher, which is a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.
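The same queries can be issued programmatically. The sketch below uses the official Neo4j Java driver; the Bolt URL, the credentials, and the Method/INVOKES graph schema are assumptions for illustration, not the actual SDS graph layout.

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Record;
import org.neo4j.driver.Result;
import org.neo4j.driver.Session;

// Sketch of querying the graph store with Cypher from Java.
public class InvocationQuery {
    public static void main(String[] args) {
        try (Driver driver = GraphDatabase.driver(
                "bolt://localhost:7687", AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            Result result = session.run(
                "MATCH (a:Method)-[:INVOKES]->(b:Method) " +
                "RETURN a.signature AS invoking, b.signature AS invoked LIMIT 25");
            while (result.hasNext()) {
                Record row = result.next();
                System.out.println(row.get("invoking").asString()
                        + " -> " + row.get("invoked").asString());
            }
        }
    }
}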

Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, building on the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.
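The sketch below illustrates how such a tracer hooks into the Eclipse debug core. The registration calls and callback signatures are the standard org.eclipse.debug.core APIs; the event filtering and the sendToSds helper are placeholders for the SDT's actual RESTful reporting, not its real code.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

// Minimal sketch of a tracer in the spirit of the SDT's DebugTracer.
public class DebugTracerSketch implements IDebugEventSetListener, IBreakpointListener {

    public void start() {
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);
        plugin.getBreakpointManager().addBreakpointListener(this);
    }

    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            // Stepping arrives as a SUSPEND event carrying a STEP_END detail.
            if (event.getKind() == DebugEvent.SUSPEND
                    && (event.getDetail() & DebugEvent.STEP_END) != 0) {
                sendToSds("step", event.getSource());
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        sendToSds("breakpoint-added", breakpoint);
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) {
        sendToSds("breakpoint-removed", breakpoint);
    }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) {
        sendToSds("breakpoint-changed", breakpoint);
    }

    private void sendToSds(String kind, Object payload) {
        // Placeholder: the real SDT sends RESTful messages to the SDS here.
        System.out.println(kind + ": " + payload);
    }
}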

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and stack-trace items are analyzed by the Tracer, extracting method invocations.

To use the SDT, a developer must first open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.



Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and related data. It also stores data about called methods, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following:

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown in Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (non-invoked method), and the last node is an ending method (non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can directly go to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. These are direct call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on these graphs. The starting points (non-invoked methods) are allocated at the top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invocation methods by pressing the F9 (forward) and F10 (backward) keys. They can also directly go to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.
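The sketch below shows the kind of fuzzy query such a search could issue against ElasticSearch over HTTP. The index name (breakpoints), the field name (typeName), and the host are assumptions for illustration; only the fuzzy-query JSON shape follows the standard ElasticSearch query DSL.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch of a fuzzy breakpoint search against an ElasticSearch index.
public class FuzzyBreakpointSearch {
    public static void main(String[] args) throws Exception {
        String query =
            "{ \"query\": { \"fuzzy\": { \"typeName\": { \"value\": \"fcatory\" } } } }";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/breakpoints/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(query))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // Despite the misspelling, hits such as "Factory" types are returned.
        System.out.println(response.body());
    }
}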


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertexes V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair ⟨Vi, Vj⟩, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertexes invoking methods and β is the subset of all vertexes invoked by methods, then the starting and ending methods are:

StartingPoint = { V_SP | V_SP ∈ α ∧ V_SP ∉ β }

EndingPoint = { V_EP | V_EP ∈ β ∧ V_EP ∉ α }

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
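Computing these sets from collected invocations is a simple set difference, as the sketch below shows (Java 16+ for the record syntax); the Invocation pair is a stand-in for the SDS invocation data, and the method names are illustrative.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of computing starting/ending methods from collected invocations,
// following the set definitions above.
public class StartEndPoints {
    record Invocation(String invoking, String invoked) {}

    public static void main(String[] args) {
        List<Invocation> edges = List.of(
                new Invocation("Main.main", "Circle.draw"),
                new Invocation("Circle.draw", "Canvas.paint"));

        Set<String> alpha = new HashSet<>(); // all invoking methods
        Set<String> beta = new HashSet<>();  // all invoked methods
        for (Invocation e : edges) {
            alpha.add(e.invoking());
            beta.add(e.invoked());
        }

        Set<String> starting = new HashSet<>(alpha);
        starting.removeAll(beta); // in alpha and not in beta
        Set<String> ending = new HashSet<>(beta);
        ending.removeAll(alpha);  // in beta and not in alpha

        System.out.println("Starting: " + starting); // [Main.main]
        System.out.println("Ending: " + ending);     // [Canvas.paint]
    }
}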

Summary

Through the SDI, we provide a technique and model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes the resulting call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 37: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

Swarm Debugging the Collective Intelligence on Interactive Debugging 37

Pothier and Tanter [29] proposed Omniscient debuggers an approach tosupport back-in-time navigation across previous program states Delta debug-ging [53] by Hofer et al means that the smaller the failure-inducing inputthe less program code is covered It can be used to minimise a failure-inducinginput systematically Ressia [54] proposed object-centric debugging focusingon objects as the key abstraction execution for many tasks

Estler et al [55] discussed collaborative debugging suggesting that collab-oration in debugging activities is perceived as important by developers andcan improve their experience Our approach is consistent with this findingalthough we use asynchronous debugging sessions

Empirical Studies on Debugging Jiang et al [33] studied the change impactanalysis process that should be done during software maintenance by devel-opers to make sure changes do not introduce new faults They conducted twostudies about change impact analysis during debugging sessions They foundthat the programmers in their studies did static change impact analysis beforethey made changes by using IDE navigational functionalities They also diddynamic change impact analysis after they made changes by running the pro-grams In their study programmers did not use any change impact analysistools

Zhang et al [14] proposed a method to generate breakpoints based onexisting fault localization techniques showing that the generated breakpointscan usually save some human effort for debugging

10 Conclusion

Debugging is an important and challenging task in software maintenance re-quiring dedication and expertise However despite its importance developersrsquodebugging behaviors have not been extensively and comprehensively studiedIn this paper we introduced the concept of Swarm Debugging based on thefact that developers performing different debugging sessions build collectiveknowledge We asked what debugging information is useful to share amongdevelopers to ease debugging We particularly studied two pieces of debugginginformation breakpoints (and their locations) and sessions (debugging paths)because these pieces of information are related to the two main activities dur-ing debugging setting breakpoints and stepping inoverout statements

To evaluate the usefulness of Swarm Debugging and the sharing of de-bugging data we conducted two observational studies In the first study tounderstand how developers set breakpoints we collected and analyzed morethan 10 hours of developersrsquo videos in 45 debugging sessions performed by 28different independent developers containing 307 breakpoints on three soft-ware systems

The first study allowed us to draw four main conclusions At first settingthe first breakpoint is not an easy task and developers need tools to locatethe places where to toggle breakpoints Secondly the time of setting the first

38Please give a shorter version with authorrunning and titlerunning prior to maketitle

breakpoint is a predictor for the duration of a debugging task independentlyof the task Third developers choose breakpoints purposefully with an under-lying rationale because different developers set breakpoints on the same lineof code for the same task and also different developers toggle breakpointson the same classes or methods for different tasks showing the existence ofimportant ldquodebugging hot-spotsrdquo (ie regions in the code where there is moreincidence of debugging events) andndashor more error-prone classes and methodsFinally and surprisingly different independent developers set breakpoints atthe same locations for similar debugging tasks and thus collecting and sharingbreakpoints could assist developers during debugging task

Further we conducted a qualitative study with 23 professional developersand a controlled experiment with 13 professional developers collecting morethan 3 hours of developersrsquo debugging sessions From this second study weconcluded that (1) combining stepping paths in a graph visualisation from sev-eral debugging sessions produced elements to support developersrsquo hypothesesabout fault locations without looking at the code previously and (2) sharingprevious debugging sessions support debugging hypothesis and consequentlyreducing the effort on searching of code

Our results provide evidence that previous debugging sessions provide in-sights to and can be starting points for developers when building debugginghypotheses They showed that developers construct correct hypotheses on faultlocation when looking at graphs built from previous debugging sessions More-over they showed that developers can use past debugging sessions to identifystarting points for new debugging sessions Furthermore faults are recurrentand may be reopened sometime months later Sharing debugging sessions (asMylyn for editing sessions) is an approach to support debugging hypothesesand to support the reconstruction of the complex mental model processes in-volved in debugging However research work is in progress to corroborate theseresults

In future work we plan to build grounded theories on the use of breakpointsby developers We will use these theories to recommend breakpoints to otherdevelopers Developers need tools to locate adequate places to set breakpointsin their source code Our results suggest the opportunity for a breakpointrecommendation system similar to previous work [14] They could also formthe basis for building a grounded theory of the developersrsquo use of breakpointsto improve debuggers and other tool support

Moreover we also suggest that debugging tasks could be divided intotwo activities one of locating bugs which could benefit from the col-lective intelligence of other developers and could be performed by dedicatedldquohuntersrdquo and another one of fixing the faults which requires deep un-derstanding of the program its design its architecture and the consequencesof changes This latter activity could be performed by dedicated ldquobuildersrdquoHence actionable results include recommender systems and a change of paradigmin the debugging of software programs

Last but not least the research community can leverage the SDI to con-duct more studies to improve our understanding of developersrsquo debugging be-

Swarm Debugging the Collective Intelligence on Interactive Debugging 39

haviour which could ultimately result into the development of whole newfamilies of debugging tools that are more efficient andndashor more adapted tothe particularity of debugging Many open questions remain and this paper isjust a first step towards fully understanding how collective intelligence couldimprove debugging activities

Our vision is that IDEs should incorporate a general framework to captureand exploit IDE interactions creating an ecosystem of context-aware appli-cations and plugins Swarm Debugging is the first step towards intelligentdebuggers and IDEs context-aware programs that monitor and reason abouthow developers interact with them providing for crowd software-engineering

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engi-neering Research Council of Canada (NSERC) the Brazilian research fundingagencies CNPq (National Council for Scientific and Technological Develop-ment) and CAPES Foundation (Finance Code 001) We also acknowledgeall the participants in our experiments and the insightful comments from theanonymous reviewers

References

1 AS Tanenbaum WH Benson Software Practice and Experience 3(2) 109 (1973)DOI 101002spe4380030204

2 H Katso in Unix Programmerrsquos Manual (Bell Telephone Laboratories Inc 1979) pNA

3 MA Linton in Proceedings of the Summer USENIX Conference (1990) pp 211ndash2204 R Stallman S Shebs Debugging with GDB - The GNU Source-Level Debugger (GNU

Press 2002)5 P Wainwright GNU DDD - Data Display Debugger (2010)6 A Ko Proceeding of the 28th international conference on Software engineering - ICSE

rsquo06 p 989 (2006) DOI 101145113428511344717 J Roszligler in 2012 1st International Workshop on User Evaluation for Software Engi-

neering Researchers USER 2012 - Proceedings (2012) pp 13ndash16 DOI 101109USER20126226573

8 TD LaToza Ba Myers 2010 ACMIEEE 32nd International Conference on SoftwareEngineering 1 185 (2010) DOI 10114518067991806829

9 AJ Ko HH Aung BA Myers in Proceedings 27th International Conference onSoftware Engineering 2005 ICSE 2005 (2005) pp 126ndash135 DOI 101109ICSE20051553555

10 TD LaToza G Venolia R DeLine in ICSE (2006) pp 492ndash50111 W Maalej R Tiarks T Roehm R Koschke ACM Transactions on Software Engi-

neering and Methodology 23(4) 1 (2014) DOI 101145262266912 MA Storey L Singer B Cleary F Figueira Filho A Zagalsky in Proceedings of the

on Future of Software Engineering - FOSE 2014 (ACM Press New York New YorkUSA 2014) pp 100ndash116 DOI 10114525938822593887

13 M Beller N Spruit A Zaidman How developers debug (2017) URL httpsdoi

org107287peerjpreprints2743v1

14 C Zhang J Yang D Yan S Yang Y Chen Journal of Software 86(3) 603 (2013)DOI 104304jsw83603-616

40Please give a shorter version with authorrunning and titlerunning prior to maketitle

15 R Tiarks T Rohm Softwaretechnik-Trends 32(2) 19 (2013) DOI 101007BF03323460 URL httplinkspringercom101007BF03323460

16 F Petrillo G Lacerda M Pimenta C Freitas in 2015 IEEE 3rd Working Con-ference on Software Visualization (VISSOFT) (IEEE 2015) pp 140ndash144 DOI101109VISSOFT20157332425

17 F Petrillo Z Soh F Khomh M Pimenta C Freitas YG Gueheneuc in In Proceed-ings of the 2016 IEEE International Conference on Software Quality Reliability andSecurity (QRS) (2016) p 10

18 F Petrillo H Mandian A Yamashita F Khomh YG Gueheneuc in 2017 IEEEInternational Conference on Software Quality Reliability and Security (QRS) (2017)pp 285ndash295 DOI 101109QRS201739

19 K Araki Z Furukawa J Cheng Software IEEE 8(3) 14 (1991) DOI 101109528893920 I Zayour A Hamdar Information and Software Technology 70 130 (2016) DOI

101016jinfsof20151001021 R Chern K De Volder in Proceedings of the 6th International Conference on Aspect-

oriented Software Development (ACM New York NY USA 2007) AOSD rsquo07 pp96ndash106 DOI 10114512185631218575 URL httpdoiacmorg1011451218563

1218575

22 Eclipse Managing conditional breakpoints URL httphelpeclipseorg

neonindexjsptopic=2Forgeclipsejdtdocuser2Ftasks2Ftask-manage_

conditional_breakpointhtmampcp=1_3_6_0_5

23 S Garnier J Gautrais G Theraulaz Swarm Intelligence 1(1) 3 (2007) DOI 101007s11721-007-0004-y

24 WR Tschinkel Journal of Bioeconomics 17(3) 271 (2015) DOI 101007s10818-015-9203-6 URL httpdxdoiorg101007s10818-015-9203-6http

linkspringercom101007s10818-015-9203-6

25 T Ball S Eick Computer 29(4) 33 (1996) DOI 101109248829926 A Cockburn Agile Software Development The Cooperative Game Second Edition

(Addison-Wesley Professional 2006)27 J Lawrance C Bogart M Burnett R Bellamy K Rector SD Fleming IEEE Trans-

actions on Software Engineering 39(2) 197 (2013) DOI 101109TSE201011128 D Piorkowski SD Fleming C Scaffidi M Burnett I Kwan AZ Henley J Macbeth

C Hill A Horvath in 2015 IEEE International Conference on Software Maintenanceand Evolution (ICSME) (2015) pp 11ndash20 DOI 101109ICSM20157332447

29 G Pothier E Tanter IEEE Software 26(6) 78 (2009) DOI 101109MS2009169URL httpieeexploreieeeorglpdocsepic03wrapperhtmarnumber=5287015

30 M Beller N Spruit D Spinellis A Zaidman in 40th International Conference on Soware Engineering ICSE (2018) pp 572ndash583

31 D Grove G DeFouw J Dean C Chambers Proceedings of the 12th ACM SIG-PLAN conference on Object-oriented programming systems languages and appli-cations - OOPSLA rsquo97 pp 108ndash124 (1997) DOI 101145263698264352 URLhttpportalacmorgcitationcfmdoid=263698264352

32 R Saito ME Smoot K Ono J Ruscheinski Pl Wang S Lotia AR Pico GDBader T Ideker Nature methods 9(11) 1069 (2012) DOI 101038nmeth2212URL httpwwwpubmedcentralnihgovarticlerenderfcgiartid=3649846amptool=

pmcentrezamprendertype=abstract

33 S Jiang C McMillan R Santelices Empirical Software Engineering pp 1ndash39 (2016)DOI 101007s10664-016-9441-9

34 R Pienta J Abello M Kahng DH Chau in 2015 International Conference on BigData and Smart Computing (BIGCOMP) (IEEE 2015) pp 271ndash278 DOI 10110935021BIGCOMP20157072812

35 J Sillito GC Murphy KD Volder IEEE Transactions on Software Engineering 34(4)434 (2008)

36 W Maalej R Tiarks T Roehm R Koschke ACM Transactions on Software En-gineering and Methodology 23(4) 311 (2014) DOI 1011452622669 URL http

doiacmorg1011452622669

37 AJ Ko BA Myers MJ Coblenz HH Aung IEEE Transaction on Software Engi-neering 32(12) 971 (2006)

Swarm Debugging the Collective Intelligence on Interactive Debugging 41

38 S Wang D Lo in Proceedings of the 22nd International Conference on Program Com-prehension - ICPC 2014 (ACM Press New York New York USA 2014) pp 53ndash63DOI 10114525970082597148

39 J Zhou H Zhang D Lo in 2012 34th International Conference on Software Engi-neering (ICSE) (IEEE 2012) pp 14ndash24 DOI 101109ICSE20126227210

40 X Ye R Bunescu C Liu in Proceedings of the 22nd ACM SIGSOFT InternationalSymposium on Foundations of Software Engineering - FSE 2014 (ACM Press NewYork New York USA 2014) pp 689ndash699 DOI 10114526358682635874

41 M Kersten GC Murphy in Proceedings of the 14th ACM SIGSOFT internationalsymposium on Foundations of software engineering (2006) pp 1ndash11

42 H Sanchez R Robbes VM Gonzalez in Software Analysis Evolution and Reengi-neering (SANER) 2015 IEEE 22nd International Conference on (2015) pp 251ndash260

43 A Ying M Robillard in Proceedings International Conference on Program Compre-hension (2011) pp 31ndash40

44 F Zhang F Khomh Y Zou AE Hassan in Proceedings Working Conference onReverse Engineering (2012) pp 456ndash465

45 Z Soh F Khomh YG Gueheneuc G Antoniol B Adams in Reverse Engineering(WCRE) 2013 20th Working Conference on (2013) pp 391ndash400 DOI 101109WCRE20136671314

46 TM Ahmed W Shang AE Hassan in Mining Software Repositories (MSR) 2015IEEEACM 12th Working Conference on (2015) pp 99ndash110 DOI 101109MSR201517

47 P Romero B du Boulay R Cox R Lutz S Bryant International Journal of Human-Computer Studies 65(12) 992 (2007) DOI 101016jijhcs200707005

48 I Katz J Anderson Human-Computer Interaction 3(4) 351 (1987) DOI 101207s15327051hci0304 2

49 B Ashok J Joy H Liang SK Rajamani G Srinivasa V Vangala in Proceedings ofthe 7th joint meeting of the European software engineering conference and the ACMSIGSOFT symposium on The foundations of software engineering on European soft-ware engineering conference and foundations of software engineering symposium - E(ACM Press New York New York USA 2009) p 373 DOI 10114515956961595766

50 C Parnin A Orso Proceedings of the 2011 International Symposium on SoftwareTesting and Analysis ISSTA 11 p 199 (2011) DOI 10114520014202001445

51 AX Zheng MI Jordan B Liblit M Naik A Aiken Challenges 148 1105 (2006)DOI 10114511438441143983

52 A Ko BA Myers in CHI 2009 - Proceedings of the SIGCHI Conference on HumanFactors in Computing Systems ed by ACM (New York New York USA 2009) pp1569ndash1578 DOI 10114515187011518942

53 B Hofer F Wotawa Advances in Software Engineering 2012 13 (2012) DOI 1011552012628571

54 J Ressia A Bergel O Nierstrasz in Proceedings - International Conference on Soft-ware Engineering (2012) pp 485ndash495 DOI 101109ICSE20126227167

55 HC Estler M Nordio Ca Furia B Meyer Proceedings - IEEE 8th InternationalConference on Global Software Engineering ICGSE 2013 pp 110ndash119 (2013) DOI101109ICGSE201321

56 S Demeyer S Ducasse M Lanza in Proceedings of the Sixth Working Conference onReverse Engineering (IEEE Computer Society Washington DC USA 1999) WCRErsquo99 pp 175ndash URL httpdlacmorgcitationcfmid=832306837044

57 D Piorkowski S Fleming in Proceedings of the SIGCHI Conference on Human Factorsin Computing Systems CHI rsquo13 (ACM Paris France 2013) pp 3063mdash-3072 DOI10114524664162466418 URL httpdlacmorgcitationcfmid=2466418

58 SD Fleming C Scaffidi D Piorkowski M Burnett R Bellamy J Lawrance I KwanACM Transactions on Software Engineering and Methodology 22(2) 1 (2013) DOI10114524305452430551

42Please give a shorter version with authorrunning and titlerunning prior to maketitle

Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed bythe Swarm Debugging Tracer (SDT) to store and later share debugging datafrom and between developers Figure 13 shows the architecture of this in-frastructure The SDT sends RESTful messages that are received by a SDSinstance that stores them in three specialized persistence mechanisms an SQLdatabase (PostgreSQL) a full-text search engine (ElasticSearch) and a graphdatabase (Neo4J)

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define thesemantics of the SDT messages

We choose and define domain concepts to model software projects anddebugging data Figure 14 shows the meta-model of these concepts using anentity-relationship representation The concepts are inspired by the FAMIXData model [56] The concepts include

ndash Developer is a SDT user She creates and executes debugging sessionsndash Product is the target software product A product is a set of Eclipse

projects (1 or more)ndash Task is the task to be executed by developersndash Session represents a Swarm Debugging session It relates developer project

and debugging events

Swarm Debugging the Collective Intelligence on Interactive Debugging 43

Fig 14 The Swarm Debugging metadata [17]

ndash Type represents classes and interfaces in the project Each type has asource code and a file SDS only considers types that have source codeavailable as belonging to the project domain

ndash Method is a method associated with a type which can be invoked duringdebugging sessions

ndash Namespace is a container for types In Java namespaces are declaredwith the keyword package

ndash Invocation is a method invoked from another method (or from the JVMin case of the main method)

ndash Breakpoint represents the data collected when a developer toggles abreakpoint in the Eclipse IDE Each breakpoint is associated with a typeand a method if appropriate

ndash Event is an event data that is collected when a developer performs someactions during a debugging session

The SDS provides several services for manipulating querying and search-ing collected data (1) Swarm RESTful API (2) SQL query console (3) full-text search API (4) dashboard service and (5) graph querying console

Swarm RESTful API The SDS provides a RESTful API to manipulate de-bugging data using the Spring Boot framework29 Create retrieve update

29 httpprojectsspringiospring-boot

44Please give a shorter version with authorrunning and titlerunning prior to maketitle

and delete operations are available through HTTP requests and respond witha JSON structure For example upon submitting the HTTP request

httpswarmdebuggingorgdevelopers

searchfindByNamename=petrillo

the SDS responds with a list of developers whose names are ldquopetrillordquo inJSON format

SQL Query Console The SDS provides a console30 to receive SQL queries(SQL) on the debugging data providing relational aggregations and functions

Full-text Search Engine The SDS also provides an ElasticSearch31 which isa highly scalable open-source full-text search and analytic engine to storesearch and analyse the debugging data The SDS instantiates an instance ofthe ElasticSearch engine and offers a console for executing complex queries onthe debugging data

Dashboard Service The ElasticSearch allows the use of the Kibana dashboardThe SDS exposes a Kibana instance on the debugging data With the dash-board researchers can build charts describing the data Figure 15 shows aSwarm Dashboard embedded into Eclipse as a view

Fig 15 Swarm Debugging Dashboard

30 httpdbswarmdebuggingorg31 httpswwwelasticco

Swarm Debugging the Collective Intelligence on Interactive Debugging 45

Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32

graph database Neo4J provides a query language named Cypher which is adeclarative SQL-inspired language for describing patterns in graphs It allowsresearchers to express what they want to select insert update or delete froma graph database without describing precisely how to do it The SDS exposesthe Neo4J Browser and creates an Eclipse view

Figure 16 shows an example of Cypher query and the resulting graph

Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debug-ger events during debugging sessions extending the Java Platform DebuggingArchitecture (JDPA) Using the Eclipse JPDA events are listened by our De-bugTracer that implements two listenersIDebugEventSetListener and IBreakpointListener Figure 17 shows theSDT architecture

After an authentication process developers create a debugging session us-ing the Swarm Manager view and toggle breakpoints trigger stepping eventsas Step Into Step Over or Step Return These events are caught and stacktrace items are analyzed by the Tracer extracting method invocations

To use the SDT a developer must open the view ldquoSwarm Managerrdquo and es-tablish a connection with the Swarm Debugging Services If the target projectis not into the Swarm Manager she can associate any project in her work-space into Swarm Manager (as shown in Figure 18) This association consistsof linking a Swarm Session with a project in the Eclipse workspace Secondshe must create a Swarm session Once a session is established she can useany feature of the regular Eclipse debugger the SDT collects developersrsquo in-teraction events in the background with no visible performance decrease

32 httpneo4jcom

46Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically the developer will toggle some breakpoints to stop the executionof the program of interest at locations deemed relevant to fix the fault at handThe SDT collects the data associated to these breakpoints (locations condi-tions and so on) After toggling breakpoints the developer runs the programin debug mode The program stops at the first reached breakpoint Conse-quently for each event such as Step Into or Breakpoint the SDT capturesthe event and related data It also stores data about methods called storing

Swarm Debugging the Collective Intelligence on Interactive Debugging 47

Fig 19 Breakpoint search tool (fuzzy search example)

invocations entry for each pair invokinginvoked method Following the forag-ing approach [57] the SDT only collects invokinginvoked methods that werevisited by the developer during the debugging session ignoring other invoca-tions The debugging activity continues until the program run finishes TheSwarm session is then completed

To avoid performance and memory issues the SDT collects and sends thedata using a set of specialised DomainServices that send RESTful messagesto a SwarmRestFacade connecting to the Swarm Debugging Services

Swarm Debugging Views

On top of the SDS the SDI implements and proposes several tools to searchand visualise the data collected during debugging sessions These tools areintegrated in the Eclipse IDE simplifying their usage They include but arenot limited to the followings

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] torepresent sequences of method invocations as shown by Figure 20 They usecircles to represent methods and arrows to represent invocations Each line isa complete stack trace without returns The first node is a starting method(non-invoked method) and the last node is an ending method (non-invokingmethod) If an invocation chain contains a non-starting method a new line iscreated and the actual stack is repeated and a dotted arrow is used to representa return for this node as illustrated by the method Circledraw in Figure 20In addition developers can directly go to a method in the Eclipse Editor bydouble-clicking over a node in the diagram

Dynamic Method Call Graphs They are direct call graphs [31] as shown inFigure 21 to display the hierarchical relations between invoked methods Theyuse circles to represent methods and oriented arrows to express invocationsEach session generates a graph and all invocations collected during the sessionare shown on these graphs The starting points (non-invoked methods) areallocated on top of a tree and adjacent nodes represent invocations sequences

48Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invocation methods pressing the F9(forward) and F10 (backward) keys They can also directly go to a method inthe Eclipse Editor by double-clicking on nodes in the graphs

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58]when working with the debugger For each breakpoint the SDS captures thetype and location in the type where the breakpoint was toggled Thus de-velopers can share their breakpoints The breakpoint search tool allows fuzzymatch and wildcard ElasticSearch queries Results are displayed in the SearchView table for easy selection Developers can also open a type directly in theEclipse Editor by double-clicking on a selected breakpoint

Figure 19 shows an example of breakpoint search in which the search boxcontains the misspelled word fcatory

Swarm Debugging the Collective Intelligence on Interactive Debugging 49

Fig 21 Method call graph for Bridge design pattern [17]

StartingEnding Method Search Tool

This tool allows searching for methods that (1) only invoke other methods butthat are not explicitly invoked themselves during the debugging session and(2) that are only invoked by others but that do not invoke other methods

Formally we define StartingEnding methods as follows Given a graphG = (VE) where V is a set of vertexes V = V1 V2 Vn and E is a setof edges E = (V1 V2) (V 1 V 3) Then each edge is formed by a pairlt Vi Vj gt were Vi is the invoking method and Vj is the invoked methodIf α is the subset of all vertexes invoking methods and β is the subset of allvertexes invoked by methods then the Starting and Ending methods are

StartingPoint = VSP | VSP isin α and VSP isin β

EndingPoint = VEP | VEP isin β and VEP isin α

Locating these methods is important in a debugging session because theyare the entries and exits points of a program at runtime

Summary

Through the SDI we provide a technique and model to collect store and shareinteractive debugging session data contextualizing breakpoints and eventsduring these sessions We created real-time and interactive visualizations usingweb technologies providing an automatic memory for developer explorationsMoreover dividing software exploration by sessions and its call graphs areeasy to understand because only intentional visited areas are shown on these

50Please give a shorter version with authorrunning and titlerunning prior to maketitle

graphs one can go through the execution of a project and see only the impor-tant areas that are relevant to developers

Currently the Swarm Tracer is implemented in Java using Eclipse DebugCore services However SDI provides a RESTful API that can be accessedindependently and new tracers can be implemented for different IDEs or de-buggers

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 38: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

38Please give a shorter version with authorrunning and titlerunning prior to maketitle

breakpoint is a predictor for the duration of a debugging task independentlyof the task Third developers choose breakpoints purposefully with an under-lying rationale because different developers set breakpoints on the same lineof code for the same task and also different developers toggle breakpointson the same classes or methods for different tasks showing the existence ofimportant ldquodebugging hot-spotsrdquo (ie regions in the code where there is moreincidence of debugging events) andndashor more error-prone classes and methodsFinally and surprisingly different independent developers set breakpoints atthe same locations for similar debugging tasks and thus collecting and sharingbreakpoints could assist developers during debugging task

Further we conducted a qualitative study with 23 professional developersand a controlled experiment with 13 professional developers collecting morethan 3 hours of developersrsquo debugging sessions From this second study weconcluded that (1) combining stepping paths in a graph visualisation from sev-eral debugging sessions produced elements to support developersrsquo hypothesesabout fault locations without looking at the code previously and (2) sharingprevious debugging sessions support debugging hypothesis and consequentlyreducing the effort on searching of code

Our results provide evidence that previous debugging sessions provide in-sights to and can be starting points for developers when building debugginghypotheses They showed that developers construct correct hypotheses on faultlocation when looking at graphs built from previous debugging sessions More-over they showed that developers can use past debugging sessions to identifystarting points for new debugging sessions Furthermore faults are recurrentand may be reopened sometime months later Sharing debugging sessions (asMylyn for editing sessions) is an approach to support debugging hypothesesand to support the reconstruction of the complex mental model processes in-volved in debugging However research work is in progress to corroborate theseresults

In future work we plan to build grounded theories on the use of breakpointsby developers We will use these theories to recommend breakpoints to otherdevelopers Developers need tools to locate adequate places to set breakpointsin their source code Our results suggest the opportunity for a breakpointrecommendation system similar to previous work [14] They could also formthe basis for building a grounded theory of the developersrsquo use of breakpointsto improve debuggers and other tool support

Moreover we also suggest that debugging tasks could be divided intotwo activities one of locating bugs which could benefit from the col-lective intelligence of other developers and could be performed by dedicatedldquohuntersrdquo and another one of fixing the faults which requires deep un-derstanding of the program its design its architecture and the consequencesof changes This latter activity could be performed by dedicated ldquobuildersrdquoHence actionable results include recommender systems and a change of paradigmin the debugging of software programs

Last but not least the research community can leverage the SDI to con-duct more studies to improve our understanding of developersrsquo debugging be-

Swarm Debugging the Collective Intelligence on Interactive Debugging 39

haviour which could ultimately result into the development of whole newfamilies of debugging tools that are more efficient andndashor more adapted tothe particularity of debugging Many open questions remain and this paper isjust a first step towards fully understanding how collective intelligence couldimprove debugging activities

Our vision is that IDEs should incorporate a general framework to captureand exploit IDE interactions creating an ecosystem of context-aware appli-cations and plugins Swarm Debugging is the first step towards intelligentdebuggers and IDEs context-aware programs that monitor and reason abouthow developers interact with them providing for crowd software-engineering

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engi-neering Research Council of Canada (NSERC) the Brazilian research fundingagencies CNPq (National Council for Scientific and Technological Develop-ment) and CAPES Foundation (Finance Code 001) We also acknowledgeall the participants in our experiments and the insightful comments from theanonymous reviewers

References

1 AS Tanenbaum WH Benson Software Practice and Experience 3(2) 109 (1973)DOI 101002spe4380030204

2 H Katso in Unix Programmerrsquos Manual (Bell Telephone Laboratories Inc 1979) pNA

3 MA Linton in Proceedings of the Summer USENIX Conference (1990) pp 211ndash2204 R Stallman S Shebs Debugging with GDB - The GNU Source-Level Debugger (GNU

Press 2002)5 P Wainwright GNU DDD - Data Display Debugger (2010)6 A Ko Proceeding of the 28th international conference on Software engineering - ICSE

rsquo06 p 989 (2006) DOI 101145113428511344717 J Roszligler in 2012 1st International Workshop on User Evaluation for Software Engi-

neering Researchers USER 2012 - Proceedings (2012) pp 13ndash16 DOI 101109USER20126226573

8 TD LaToza Ba Myers 2010 ACMIEEE 32nd International Conference on SoftwareEngineering 1 185 (2010) DOI 10114518067991806829

9 AJ Ko HH Aung BA Myers in Proceedings 27th International Conference onSoftware Engineering 2005 ICSE 2005 (2005) pp 126ndash135 DOI 101109ICSE20051553555

10 TD LaToza G Venolia R DeLine in ICSE (2006) pp 492ndash50111 W Maalej R Tiarks T Roehm R Koschke ACM Transactions on Software Engi-

neering and Methodology 23(4) 1 (2014) DOI 101145262266912 MA Storey L Singer B Cleary F Figueira Filho A Zagalsky in Proceedings of the

on Future of Software Engineering - FOSE 2014 (ACM Press New York New YorkUSA 2014) pp 100ndash116 DOI 10114525938822593887

13 M Beller N Spruit A Zaidman How developers debug (2017) URL httpsdoi

org107287peerjpreprints2743v1

14. C. Zhang, J. Yang, D. Yan, S. Yang, Y. Chen, Journal of Software 8(3), 603 (2013). DOI 10.4304/jsw.8.3.603-616

15. R. Tiarks, T. Roehm, Softwaretechnik-Trends 32(2), 19 (2013). DOI 10.1007/BF03323460. URL http://link.springer.com/10.1007/BF03323460

16. F. Petrillo, G. Lacerda, M. Pimenta, C. Freitas, in 2015 IEEE 3rd Working Conference on Software Visualization (VISSOFT) (IEEE, 2015), pp. 140–144. DOI 10.1109/VISSOFT.2015.7332425

17. F. Petrillo, Z. Soh, F. Khomh, M. Pimenta, C. Freitas, Y.G. Guéhéneuc, in Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2016), p. 10

18. F. Petrillo, H. Mandian, A. Yamashita, F. Khomh, Y.G. Guéhéneuc, in 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS) (2017), pp. 285–295. DOI 10.1109/QRS.2017.39

19. K. Araki, Z. Furukawa, J. Cheng, IEEE Software 8(3), 14 (1991). DOI 10.1109/52.88939

20. I. Zayour, A. Hamdar, Information and Software Technology 70, 130 (2016). DOI 10.1016/j.infsof.2015.10.010

21. R. Chern, K. De Volder, in Proceedings of the 6th International Conference on Aspect-oriented Software Development, AOSD '07 (ACM, New York, NY, USA, 2007), pp. 96–106. DOI 10.1145/1218563.1218575. URL http://doi.acm.org/10.1145/1218563.1218575

22. Eclipse. Managing conditional breakpoints. URL http://help.eclipse.org/neon/index.jsp?topic=%2Forg.eclipse.jdt.doc.user%2Ftasks%2Ftask-manage_conditional_breakpoint.htm&cp=1_3_6_0_5

23. S. Garnier, J. Gautrais, G. Theraulaz, Swarm Intelligence 1(1), 3 (2007). DOI 10.1007/s11721-007-0004-y

24. W.R. Tschinkel, Journal of Bioeconomics 17(3), 271 (2015). DOI 10.1007/s10818-015-9203-6. URL http://dx.doi.org/10.1007/s10818-015-9203-6

25. T. Ball, S. Eick, Computer 29(4), 33 (1996). DOI 10.1109/2.488299

26. A. Cockburn, Agile Software Development: The Cooperative Game, Second Edition (Addison-Wesley Professional, 2006)

27. J. Lawrance, C. Bogart, M. Burnett, R. Bellamy, K. Rector, S.D. Fleming, IEEE Transactions on Software Engineering 39(2), 197 (2013). DOI 10.1109/TSE.2010.111

28. D. Piorkowski, S.D. Fleming, C. Scaffidi, M. Burnett, I. Kwan, A.Z. Henley, J. Macbeth, C. Hill, A. Horvath, in 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2015), pp. 11–20. DOI 10.1109/ICSM.2015.7332447

29. G. Pothier, E. Tanter, IEEE Software 26(6), 78 (2009). DOI 10.1109/MS.2009.169. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5287015

30. M. Beller, N. Spruit, D. Spinellis, A. Zaidman, in 40th International Conference on Software Engineering, ICSE (2018), pp. 572–583

31. D. Grove, G. DeFouw, J. Dean, C. Chambers, in Proceedings of the 12th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA '97, pp. 108–124 (1997). DOI 10.1145/263698.264352. URL http://portal.acm.org/citation.cfm?doid=263698.264352

32. R. Saito, M.E. Smoot, K. Ono, J. Ruscheinski, P.L. Wang, S. Lotia, A.R. Pico, G.D. Bader, T. Ideker, Nature Methods 9(11), 1069 (2012). DOI 10.1038/nmeth.2212. URL http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3649846&tool=pmcentrez&rendertype=abstract

33. S. Jiang, C. McMillan, R. Santelices, Empirical Software Engineering, pp. 1–39 (2016). DOI 10.1007/s10664-016-9441-9

34. R. Pienta, J. Abello, M. Kahng, D.H. Chau, in 2015 International Conference on Big Data and Smart Computing (BIGCOMP) (IEEE, 2015), pp. 271–278. DOI 10.1109/35021BIGCOMP.2015.7072812

35. J. Sillito, G.C. Murphy, K.D. Volder, IEEE Transactions on Software Engineering 34(4), 434 (2008)

36. W. Maalej, R. Tiarks, T. Roehm, R. Koschke, ACM Transactions on Software Engineering and Methodology 23(4), 31 (2014). DOI 10.1145/2622669. URL http://doi.acm.org/10.1145/2622669

37. A.J. Ko, B.A. Myers, M.J. Coblenz, H.H. Aung, IEEE Transactions on Software Engineering 32(12), 971 (2006)

38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension, ICPC 2014 (ACM Press, New York, New York, USA, 2014), pp. 53–63. DOI 10.1145/2597008.2597148

39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210

40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE 2014 (ACM Press, New York, New York, USA, 2014), pp. 689–699. DOI 10.1145/2635868.2635874

41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11

42. H. Sanchez, R. Robbes, V.M. Gonzalez, in 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering (SANER) (2015), pp. 251–260

43. A. Ying, M. Robillard, in Proceedings of the International Conference on Program Comprehension (2011), pp. 31–40

44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings of the Working Conference on Reverse Engineering (2012), pp. 456–465

45. Z. Soh, F. Khomh, Y.G. Guéhéneuc, G. Antoniol, B. Adams, in 2013 20th Working Conference on Reverse Engineering (WCRE) (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314

46. T.M. Ahmed, W. Shang, A.E. Hassan, in 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories (MSR) (2015), pp. 99–110. DOI 10.1109/MSR.2015.17

47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005

48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2

49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ACM Press, New York, New York, USA, 2009), p. 373. DOI 10.1145/1595696.1595766

50. C. Parnin, A. Orso, in Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11, p. 199 (2011). DOI 10.1145/2001420.2001445

51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983

52. A. Ko, B.A. Myers, in CHI 2009 - Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, New York, USA, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942

53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571

54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings of the International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167

55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, in Proceedings of the IEEE 8th International Conference on Global Software Engineering, ICGSE 2013, pp. 110–119 (2013). DOI 10.1109/ICGSE.2013.21

56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, USA, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044

57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418

58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551


Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed by the Swarm Debugging Tracer (SDT) to store and later share debugging data from and between developers. Figure 13 shows the architecture of this infrastructure. The SDT sends RESTful messages that are received by an SDS instance, which stores them in three specialized persistence mechanisms: an SQL database (PostgreSQL), a full-text search engine (ElasticSearch), and a graph database (Neo4J).

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define the semantics of the SDT messages.
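For illustration, the following minimal sketch shows how a tracer could post one such message to an SDS instance; the endpoint path and JSON field names here are our assumptions for illustration, not the actual SDS message schema.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SwarmEventPost {
    public static void main(String[] args) throws Exception {
        // Hypothetical payload for a stepping event; the field names are
        // illustrative only, not the actual SDS message schema.
        String json = "{\"session\": 42, \"kind\": \"STEP_INTO\","
                + " \"method\": \"org.example.Circle.draw()\"}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://swarmdebugging.org/events")) // assumed endpoint
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("SDS answered: " + response.statusCode());
    }
}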

We chose and defined domain concepts to model software projects and debugging data. Figure 14 shows the meta-model of these concepts using an entity-relationship representation. The concepts are inspired by the FAMIX data model [56]. The concepts include (an illustrative sketch of two of them follows the list):

– Developer is an SDT user. She creates and executes debugging sessions.
– Product is the target software product. A product is a set of Eclipse projects (1 or more).
– Task is the task to be executed by developers.
– Session represents a Swarm Debugging session. It relates developer, project, and debugging events.
– Type represents classes and interfaces in the project. Each type has a source code and a file. The SDS only considers types that have source code available as belonging to the project domain.
– Method is a method associated with a type, which can be invoked during debugging sessions.
– Namespace is a container for types. In Java, namespaces are declared with the keyword package.
– Invocation is a method invoked from another method (or from the JVM, in the case of the main method).
– Breakpoint represents the data collected when a developer toggles a breakpoint in the Eclipse IDE. Each breakpoint is associated with a type and, if appropriate, a method.
– Event is the event data collected when a developer performs some actions during a debugging session.

Fig 14 The Swarm Debugging metadata [17]
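The sketch below renders two of these concepts as plain Java classes; the field names are assumptions derived from the descriptions above and Figure 14, not the actual SDS schema.

// Plain-Java sketch of two meta-model concepts; the fields are
// assumptions based on the descriptions above, not the actual schema.
class Breakpoint {
    long id;
    long sessionId;      // the Session during which it was toggled
    String typeName;     // fully qualified Type where it was toggled
    String methodName;   // enclosing Method, if appropriate
    int lineNumber;
}

class Invocation {
    long id;
    long sessionId;
    String invokingMethod;  // caller
    String invokedMethod;   // callee (or a main method invoked by the JVM)
}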

The SDS provides several services for manipulating, querying, and searching the collected data: (1) Swarm RESTful API, (2) SQL query console, (3) full-text search API, (4) dashboard service, and (5) graph querying console.

Swarm RESTful API. The SDS provides a RESTful API to manipulate debugging data, built with the Spring Boot framework29. Create, retrieve, update, and delete operations are available through HTTP requests and respond with a JSON structure. For example, upon submitting the HTTP request

http://swarmdebugging.org/developers/search/findByName?name=petrillo

the SDS responds with the list of developers whose names are "petrillo", in JSON format.

29 http://projects.spring.io/spring-boot
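Since the API is built with Spring Boot and the /search/findByName route follows the Spring Data REST convention, a response would plausibly take a shape like the following; the exact fields are an assumption, shown only to make the example concrete:

{
  "_embedded": {
    "developers": [
      { "id": 1, "name": "petrillo" }
    ]
  }
}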

SQL Query Console. The SDS provides a console30 to receive SQL queries on the debugging data, providing relational aggregations and functions.
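For instance, a query along these lines (the table and column names are assumptions based on the meta-model above) would rank the types in which the most breakpoints were toggled:

SELECT type_name, COUNT(*) AS breakpoint_count
FROM breakpoint
GROUP BY type_name
ORDER BY breakpoint_count DESC;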

Full-text Search Engine. The SDS also provides ElasticSearch31, a highly scalable, open-source, full-text search and analytics engine, to store, search, and analyse the debugging data. The SDS runs an instance of the ElasticSearch engine and offers a console for executing complex queries on the debugging data.

Dashboard Service. ElasticSearch allows the use of the Kibana dashboard. The SDS exposes a Kibana instance on the debugging data. With the dashboard, researchers can build charts describing the data. Figure 15 shows a Swarm Dashboard embedded into Eclipse as a view.

Fig 15 Swarm Debugging Dashboard

30 http://db.swarmdebugging.org
31 https://www.elastic.co


Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console. The SDS also persists debugging data in a Neo4J32 graph database. Neo4J provides a query language named Cypher, a declarative, SQL-inspired language for describing patterns in graphs. It allows researchers to express what they want to select, insert, update, or delete from a graph database without describing precisely how to do it. The SDS exposes the Neo4J Browser and creates an Eclipse view.

Figure 16 shows an example of a Cypher query and the resulting graph.

32 http://neo4j.com
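As an illustration, a Cypher query in the spirit of Figure 16 could retrieve the invocations recorded during one session; the node labels, relationship type, and property names here are assumptions, not the actual SDS graph schema:

MATCH (caller:Method)-[inv:INVOKES]->(callee:Method)
WHERE inv.sessionId = 42
RETURN caller, inv, callee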

Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debugger events during debugging sessions, extending the Java Platform Debugger Architecture (JPDA). Using the Eclipse JPDA, events are listened to by our DebugTracer, which implements two listeners: IDebugEventSetListener and IBreakpointListener. Figure 17 shows the SDT architecture.

After an authentication process, developers create a debugging session using the Swarm Manager view, toggle breakpoints, and trigger stepping events such as Step Into, Step Over, or Step Return. These events are caught, and the stack-trace items are analyzed by the Tracer, which extracts method invocations.
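The skeleton below sketches the shape of such a tracer using the actual Eclipse debug listener interfaces; the registration and event handling are simplified for illustration, and forwarding data to the SDS is elided.

import org.eclipse.core.resources.IMarkerDelta;
import org.eclipse.debug.core.DebugEvent;
import org.eclipse.debug.core.DebugPlugin;
import org.eclipse.debug.core.IBreakpointListener;
import org.eclipse.debug.core.IDebugEventSetListener;
import org.eclipse.debug.core.model.IBreakpoint;

public class DebugTracer implements IDebugEventSetListener, IBreakpointListener {

    // Register this tracer with the Eclipse debug framework.
    public void start() {
        DebugPlugin plugin = DebugPlugin.getDefault();
        plugin.addDebugEventListener(this);
        plugin.getBreakpointManager().addBreakpointListener(this);
    }

    // Stepping is reported as RESUME events whose detail is STEP_INTO,
    // STEP_OVER, or STEP_RETURN; the matching SUSPEND event marks the end
    // of the step, where stack frames can be inspected.
    @Override
    public void handleDebugEvents(DebugEvent[] events) {
        for (DebugEvent event : events) {
            if (event.getKind() == DebugEvent.RESUME
                    && event.getDetail() == DebugEvent.STEP_INTO) {
                // extract and forward the method invocation (elided)
            }
        }
    }

    @Override
    public void breakpointAdded(IBreakpoint breakpoint) {
        // forward breakpoint location data to the SDS (elided)
    }

    @Override
    public void breakpointRemoved(IBreakpoint breakpoint, IMarkerDelta delta) { }

    @Override
    public void breakpointChanged(IBreakpoint breakpoint, IMarkerDelta delta) { }
}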

To use the SDT, a developer must open the "Swarm Manager" view and establish a connection with the Swarm Debugging Services. If the target project is not in the Swarm Manager, she can associate any project in her workspace with the Swarm Manager (as shown in Figure 18). This association consists of linking a Swarm session with a project in the Eclipse workspace. Second, she must create a Swarm session. Once a session is established, she can use any feature of the regular Eclipse debugger; the SDT collects developers' interaction events in the background, with no visible performance decrease.



Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically, the developer will toggle some breakpoints to stop the execution of the program of interest at locations deemed relevant to fix the fault at hand. The SDT collects the data associated with these breakpoints (locations, conditions, and so on). After toggling breakpoints, the developer runs the program in debug mode. The program stops at the first reached breakpoint. Consequently, for each event, such as Step Into or Breakpoint, the SDT captures the event and its related data. It also stores data about the methods called, storing an invocation entry for each pair of invoking/invoked methods. Following the foraging approach [57], the SDT only collects invoking/invoked methods that were visited by the developer during the debugging session, ignoring other invocations. The debugging activity continues until the program run finishes. The Swarm session is then completed.

Fig 19 Breakpoint search tool (fuzzy search example)

To avoid performance and memory issues, the SDT collects and sends the data using a set of specialised DomainServices that send RESTful messages to a SwarmRestFacade, connecting to the Swarm Debugging Services.

Swarm Debugging Views

On top of the SDS, the SDI implements and proposes several tools to search and visualise the data collected during debugging sessions. These tools are integrated into the Eclipse IDE, simplifying their usage. They include, but are not limited to, the following tools.

Sequence Stack Diagrams. Sequence stack diagrams are novel diagrams [16] to represent sequences of method invocations, as shown by Figure 20. They use circles to represent methods and arrows to represent invocations. Each line is a complete stack trace without returns. The first node is a starting method (a non-invoked method) and the last node is an ending method (a non-invoking method). If an invocation chain contains a non-starting method, a new line is created, the actual stack is repeated, and a dotted arrow is used to represent a return for this node, as illustrated by the method Circle.draw in Figure 20. In addition, developers can go directly to a method in the Eclipse Editor by double-clicking on a node in the diagram.

Dynamic Method Call Graphs. They are directed call graphs [31], as shown in Figure 21, that display the hierarchical relations between invoked methods. They use circles to represent methods and oriented arrows to express invocations. Each session generates a graph, and all invocations collected during the session are shown on this graph. The starting points (non-invoked methods) are placed at the top of a tree, and adjacent nodes represent invocation sequences.


Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of method invocations by pressing the F9 (forward) and F10 (backward) keys. They can also go directly to a method in the Eclipse Editor by double-clicking on nodes in the graphs.

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58] when working with the debugger. For each breakpoint, the SDS captures the type and the location in the type where the breakpoint was toggled. Thus, developers can share their breakpoints. The breakpoint search tool allows fuzzy-match and wildcard ElasticSearch queries. Results are displayed in the Search View table for easy selection. Developers can also open a type directly in the Eclipse Editor by double-clicking on a selected breakpoint.

Figure 19 shows an example of a breakpoint search in which the search box contains the misspelled word fcatory.


Fig 21 Method call graph for Bridge design pattern [17]

Starting/Ending Method Search Tool

This tool allows searching for methods that (1) only invoke other methods but are not explicitly invoked themselves during the debugging session, and (2) are only invoked by others but do not invoke other methods.

Formally, we define starting/ending methods as follows. Given a graph G = (V, E), where V is a set of vertices V = {V1, V2, ..., Vn} and E is a set of edges E = {(V1, V2), (V1, V3), ...}, each edge is formed by a pair <Vi, Vj>, where Vi is the invoking method and Vj is the invoked method. If α is the subset of all vertices that invoke methods and β is the subset of all vertices that are invoked, then the starting and ending methods are:

StartingPoints = {VSP | VSP ∈ α ∧ VSP ∉ β}

EndingPoints = {VEP | VEP ∈ β ∧ VEP ∉ α}

Locating these methods is important in a debugging session because they are the entry and exit points of a program at runtime.
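A minimal sketch of this computation, assuming the collected invocations are available as invoking/invoked pairs, is:

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GraphEndpoints {
    public static void main(String[] args) {
        // Each invocation is an (invoking, invoked) pair of method names.
        List<String[]> invocations = List.of(
                new String[] {"Main.main()", "Circle.draw()"},
                new String[] {"Circle.draw()", "Pen.down()"});

        Set<String> alpha = new HashSet<>(); // methods that invoke
        Set<String> beta = new HashSet<>();  // methods that are invoked
        for (String[] edge : invocations) {
            alpha.add(edge[0]);
            beta.add(edge[1]);
        }

        Set<String> starting = new HashSet<>(alpha);
        starting.removeAll(beta); // in alpha and not in beta
        Set<String> ending = new HashSet<>(beta);
        ending.removeAll(alpha);  // in beta and not in alpha

        System.out.println("Starting: " + starting); // [Main.main()]
        System.out.println("Ending: " + ending);     // [Pen.down()]
    }
}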

Summary

Through the SDI, we provide a technique and a model to collect, store, and share interactive debugging session data, contextualizing breakpoints and events during these sessions. We created real-time and interactive visualizations using web technologies, providing an automatic memory of developers' explorations. Moreover, dividing software exploration by sessions makes its call graphs easy to understand, because only intentionally visited areas are shown on these graphs: one can go through the execution of a project and see only the important areas that are relevant to developers.

Currently, the Swarm Tracer is implemented in Java, using Eclipse Debug Core services. However, the SDI provides a RESTful API that can be accessed independently, and new tracers can be implemented for different IDEs or debuggers.

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 39: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

Swarm Debugging the Collective Intelligence on Interactive Debugging 39

haviour which could ultimately result into the development of whole newfamilies of debugging tools that are more efficient andndashor more adapted tothe particularity of debugging Many open questions remain and this paper isjust a first step towards fully understanding how collective intelligence couldimprove debugging activities

Our vision is that IDEs should incorporate a general framework to captureand exploit IDE interactions creating an ecosystem of context-aware appli-cations and plugins Swarm Debugging is the first step towards intelligentdebuggers and IDEs context-aware programs that monitor and reason abouthow developers interact with them providing for crowd software-engineering

11 Acknowledgment

This work has been partially supported by the Natural Sciences and Engi-neering Research Council of Canada (NSERC) the Brazilian research fundingagencies CNPq (National Council for Scientific and Technological Develop-ment) and CAPES Foundation (Finance Code 001) We also acknowledgeall the participants in our experiments and the insightful comments from theanonymous reviewers

References

1 AS Tanenbaum WH Benson Software Practice and Experience 3(2) 109 (1973)DOI 101002spe4380030204

2 H Katso in Unix Programmerrsquos Manual (Bell Telephone Laboratories Inc 1979) pNA

3 MA Linton in Proceedings of the Summer USENIX Conference (1990) pp 211ndash2204 R Stallman S Shebs Debugging with GDB - The GNU Source-Level Debugger (GNU

Press 2002)5 P Wainwright GNU DDD - Data Display Debugger (2010)6 A Ko Proceeding of the 28th international conference on Software engineering - ICSE

rsquo06 p 989 (2006) DOI 101145113428511344717 J Roszligler in 2012 1st International Workshop on User Evaluation for Software Engi-

neering Researchers USER 2012 - Proceedings (2012) pp 13ndash16 DOI 101109USER20126226573

8 TD LaToza Ba Myers 2010 ACMIEEE 32nd International Conference on SoftwareEngineering 1 185 (2010) DOI 10114518067991806829

9 AJ Ko HH Aung BA Myers in Proceedings 27th International Conference onSoftware Engineering 2005 ICSE 2005 (2005) pp 126ndash135 DOI 101109ICSE20051553555

10 TD LaToza G Venolia R DeLine in ICSE (2006) pp 492ndash50111 W Maalej R Tiarks T Roehm R Koschke ACM Transactions on Software Engi-

neering and Methodology 23(4) 1 (2014) DOI 101145262266912 MA Storey L Singer B Cleary F Figueira Filho A Zagalsky in Proceedings of the

on Future of Software Engineering - FOSE 2014 (ACM Press New York New YorkUSA 2014) pp 100ndash116 DOI 10114525938822593887

13 M Beller N Spruit A Zaidman How developers debug (2017) URL httpsdoi

org107287peerjpreprints2743v1

14 C Zhang J Yang D Yan S Yang Y Chen Journal of Software 86(3) 603 (2013)DOI 104304jsw83603-616

40Please give a shorter version with authorrunning and titlerunning prior to maketitle

15 R Tiarks T Rohm Softwaretechnik-Trends 32(2) 19 (2013) DOI 101007BF03323460 URL httplinkspringercom101007BF03323460

16 F Petrillo G Lacerda M Pimenta C Freitas in 2015 IEEE 3rd Working Con-ference on Software Visualization (VISSOFT) (IEEE 2015) pp 140ndash144 DOI101109VISSOFT20157332425

17 F Petrillo Z Soh F Khomh M Pimenta C Freitas YG Gueheneuc in In Proceed-ings of the 2016 IEEE International Conference on Software Quality Reliability andSecurity (QRS) (2016) p 10

18 F Petrillo H Mandian A Yamashita F Khomh YG Gueheneuc in 2017 IEEEInternational Conference on Software Quality Reliability and Security (QRS) (2017)pp 285ndash295 DOI 101109QRS201739

19 K Araki Z Furukawa J Cheng Software IEEE 8(3) 14 (1991) DOI 101109528893920 I Zayour A Hamdar Information and Software Technology 70 130 (2016) DOI

101016jinfsof20151001021 R Chern K De Volder in Proceedings of the 6th International Conference on Aspect-

oriented Software Development (ACM New York NY USA 2007) AOSD rsquo07 pp96ndash106 DOI 10114512185631218575 URL httpdoiacmorg1011451218563

1218575

22 Eclipse Managing conditional breakpoints URL httphelpeclipseorg

neonindexjsptopic=2Forgeclipsejdtdocuser2Ftasks2Ftask-manage_

conditional_breakpointhtmampcp=1_3_6_0_5

23 S Garnier J Gautrais G Theraulaz Swarm Intelligence 1(1) 3 (2007) DOI 101007s11721-007-0004-y

24 WR Tschinkel Journal of Bioeconomics 17(3) 271 (2015) DOI 101007s10818-015-9203-6 URL httpdxdoiorg101007s10818-015-9203-6http

linkspringercom101007s10818-015-9203-6

25 T Ball S Eick Computer 29(4) 33 (1996) DOI 101109248829926 A Cockburn Agile Software Development The Cooperative Game Second Edition

(Addison-Wesley Professional 2006)27 J Lawrance C Bogart M Burnett R Bellamy K Rector SD Fleming IEEE Trans-

actions on Software Engineering 39(2) 197 (2013) DOI 101109TSE201011128 D Piorkowski SD Fleming C Scaffidi M Burnett I Kwan AZ Henley J Macbeth

C Hill A Horvath in 2015 IEEE International Conference on Software Maintenanceand Evolution (ICSME) (2015) pp 11ndash20 DOI 101109ICSM20157332447

29 G Pothier E Tanter IEEE Software 26(6) 78 (2009) DOI 101109MS2009169URL httpieeexploreieeeorglpdocsepic03wrapperhtmarnumber=5287015

30 M Beller N Spruit D Spinellis A Zaidman in 40th International Conference on Soware Engineering ICSE (2018) pp 572ndash583

31 D Grove G DeFouw J Dean C Chambers Proceedings of the 12th ACM SIG-PLAN conference on Object-oriented programming systems languages and appli-cations - OOPSLA rsquo97 pp 108ndash124 (1997) DOI 101145263698264352 URLhttpportalacmorgcitationcfmdoid=263698264352

32 R Saito ME Smoot K Ono J Ruscheinski Pl Wang S Lotia AR Pico GDBader T Ideker Nature methods 9(11) 1069 (2012) DOI 101038nmeth2212URL httpwwwpubmedcentralnihgovarticlerenderfcgiartid=3649846amptool=

pmcentrezamprendertype=abstract

33 S Jiang C McMillan R Santelices Empirical Software Engineering pp 1ndash39 (2016)DOI 101007s10664-016-9441-9

34 R Pienta J Abello M Kahng DH Chau in 2015 International Conference on BigData and Smart Computing (BIGCOMP) (IEEE 2015) pp 271ndash278 DOI 10110935021BIGCOMP20157072812

35 J Sillito GC Murphy KD Volder IEEE Transactions on Software Engineering 34(4)434 (2008)

36 W Maalej R Tiarks T Roehm R Koschke ACM Transactions on Software En-gineering and Methodology 23(4) 311 (2014) DOI 1011452622669 URL http

doiacmorg1011452622669

37 AJ Ko BA Myers MJ Coblenz HH Aung IEEE Transaction on Software Engi-neering 32(12) 971 (2006)

Swarm Debugging the Collective Intelligence on Interactive Debugging 41

38 S Wang D Lo in Proceedings of the 22nd International Conference on Program Com-prehension - ICPC 2014 (ACM Press New York New York USA 2014) pp 53ndash63DOI 10114525970082597148

39 J Zhou H Zhang D Lo in 2012 34th International Conference on Software Engi-neering (ICSE) (IEEE 2012) pp 14ndash24 DOI 101109ICSE20126227210

40 X Ye R Bunescu C Liu in Proceedings of the 22nd ACM SIGSOFT InternationalSymposium on Foundations of Software Engineering - FSE 2014 (ACM Press NewYork New York USA 2014) pp 689ndash699 DOI 10114526358682635874

41 M Kersten GC Murphy in Proceedings of the 14th ACM SIGSOFT internationalsymposium on Foundations of software engineering (2006) pp 1ndash11

42 H Sanchez R Robbes VM Gonzalez in Software Analysis Evolution and Reengi-neering (SANER) 2015 IEEE 22nd International Conference on (2015) pp 251ndash260

43 A Ying M Robillard in Proceedings International Conference on Program Compre-hension (2011) pp 31ndash40

44 F Zhang F Khomh Y Zou AE Hassan in Proceedings Working Conference onReverse Engineering (2012) pp 456ndash465

45 Z Soh F Khomh YG Gueheneuc G Antoniol B Adams in Reverse Engineering(WCRE) 2013 20th Working Conference on (2013) pp 391ndash400 DOI 101109WCRE20136671314

46 TM Ahmed W Shang AE Hassan in Mining Software Repositories (MSR) 2015IEEEACM 12th Working Conference on (2015) pp 99ndash110 DOI 101109MSR201517

47 P Romero B du Boulay R Cox R Lutz S Bryant International Journal of Human-Computer Studies 65(12) 992 (2007) DOI 101016jijhcs200707005

48 I Katz J Anderson Human-Computer Interaction 3(4) 351 (1987) DOI 101207s15327051hci0304 2

49 B Ashok J Joy H Liang SK Rajamani G Srinivasa V Vangala in Proceedings ofthe 7th joint meeting of the European software engineering conference and the ACMSIGSOFT symposium on The foundations of software engineering on European soft-ware engineering conference and foundations of software engineering symposium - E(ACM Press New York New York USA 2009) p 373 DOI 10114515956961595766

50 C Parnin A Orso Proceedings of the 2011 International Symposium on SoftwareTesting and Analysis ISSTA 11 p 199 (2011) DOI 10114520014202001445

51 AX Zheng MI Jordan B Liblit M Naik A Aiken Challenges 148 1105 (2006)DOI 10114511438441143983

52 A Ko BA Myers in CHI 2009 - Proceedings of the SIGCHI Conference on HumanFactors in Computing Systems ed by ACM (New York New York USA 2009) pp1569ndash1578 DOI 10114515187011518942

53 B Hofer F Wotawa Advances in Software Engineering 2012 13 (2012) DOI 1011552012628571

54 J Ressia A Bergel O Nierstrasz in Proceedings - International Conference on Soft-ware Engineering (2012) pp 485ndash495 DOI 101109ICSE20126227167

55 HC Estler M Nordio Ca Furia B Meyer Proceedings - IEEE 8th InternationalConference on Global Software Engineering ICGSE 2013 pp 110ndash119 (2013) DOI101109ICGSE201321

56 S Demeyer S Ducasse M Lanza in Proceedings of the Sixth Working Conference onReverse Engineering (IEEE Computer Society Washington DC USA 1999) WCRErsquo99 pp 175ndash URL httpdlacmorgcitationcfmid=832306837044

57 D Piorkowski S Fleming in Proceedings of the SIGCHI Conference on Human Factorsin Computing Systems CHI rsquo13 (ACM Paris France 2013) pp 3063mdash-3072 DOI10114524664162466418 URL httpdlacmorgcitationcfmid=2466418

58 SD Fleming C Scaffidi D Piorkowski M Burnett R Bellamy J Lawrance I KwanACM Transactions on Software Engineering and Methodology 22(2) 1 (2013) DOI10114524305452430551

42Please give a shorter version with authorrunning and titlerunning prior to maketitle

Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed bythe Swarm Debugging Tracer (SDT) to store and later share debugging datafrom and between developers Figure 13 shows the architecture of this in-frastructure The SDT sends RESTful messages that are received by a SDSinstance that stores them in three specialized persistence mechanisms an SQLdatabase (PostgreSQL) a full-text search engine (ElasticSearch) and a graphdatabase (Neo4J)

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define thesemantics of the SDT messages

We choose and define domain concepts to model software projects anddebugging data Figure 14 shows the meta-model of these concepts using anentity-relationship representation The concepts are inspired by the FAMIXData model [56] The concepts include

ndash Developer is a SDT user She creates and executes debugging sessionsndash Product is the target software product A product is a set of Eclipse

projects (1 or more)ndash Task is the task to be executed by developersndash Session represents a Swarm Debugging session It relates developer project

and debugging events

Swarm Debugging the Collective Intelligence on Interactive Debugging 43

Fig 14 The Swarm Debugging metadata [17]

ndash Type represents classes and interfaces in the project Each type has asource code and a file SDS only considers types that have source codeavailable as belonging to the project domain

ndash Method is a method associated with a type which can be invoked duringdebugging sessions

ndash Namespace is a container for types In Java namespaces are declaredwith the keyword package

ndash Invocation is a method invoked from another method (or from the JVMin case of the main method)

ndash Breakpoint represents the data collected when a developer toggles abreakpoint in the Eclipse IDE Each breakpoint is associated with a typeand a method if appropriate

ndash Event is an event data that is collected when a developer performs someactions during a debugging session

The SDS provides several services for manipulating querying and search-ing collected data (1) Swarm RESTful API (2) SQL query console (3) full-text search API (4) dashboard service and (5) graph querying console

Swarm RESTful API The SDS provides a RESTful API to manipulate de-bugging data using the Spring Boot framework29 Create retrieve update

29 httpprojectsspringiospring-boot

44Please give a shorter version with authorrunning and titlerunning prior to maketitle

and delete operations are available through HTTP requests and respond witha JSON structure For example upon submitting the HTTP request

httpswarmdebuggingorgdevelopers

searchfindByNamename=petrillo

the SDS responds with a list of developers whose names are ldquopetrillordquo inJSON format

SQL Query Console The SDS provides a console30 to receive SQL queries(SQL) on the debugging data providing relational aggregations and functions

Full-text Search Engine The SDS also provides an ElasticSearch31 which isa highly scalable open-source full-text search and analytic engine to storesearch and analyse the debugging data The SDS instantiates an instance ofthe ElasticSearch engine and offers a console for executing complex queries onthe debugging data

Dashboard Service The ElasticSearch allows the use of the Kibana dashboardThe SDS exposes a Kibana instance on the debugging data With the dash-board researchers can build charts describing the data Figure 15 shows aSwarm Dashboard embedded into Eclipse as a view

Fig 15 Swarm Debugging Dashboard

30 httpdbswarmdebuggingorg31 httpswwwelasticco

Swarm Debugging the Collective Intelligence on Interactive Debugging 45

Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32

graph database Neo4J provides a query language named Cypher which is adeclarative SQL-inspired language for describing patterns in graphs It allowsresearchers to express what they want to select insert update or delete froma graph database without describing precisely how to do it The SDS exposesthe Neo4J Browser and creates an Eclipse view

Figure 16 shows an example of Cypher query and the resulting graph

Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debug-ger events during debugging sessions extending the Java Platform DebuggingArchitecture (JDPA) Using the Eclipse JPDA events are listened by our De-bugTracer that implements two listenersIDebugEventSetListener and IBreakpointListener Figure 17 shows theSDT architecture

After an authentication process developers create a debugging session us-ing the Swarm Manager view and toggle breakpoints trigger stepping eventsas Step Into Step Over or Step Return These events are caught and stacktrace items are analyzed by the Tracer extracting method invocations

To use the SDT a developer must open the view ldquoSwarm Managerrdquo and es-tablish a connection with the Swarm Debugging Services If the target projectis not into the Swarm Manager she can associate any project in her work-space into Swarm Manager (as shown in Figure 18) This association consistsof linking a Swarm Session with a project in the Eclipse workspace Secondshe must create a Swarm session Once a session is established she can useany feature of the regular Eclipse debugger the SDT collects developersrsquo in-teraction events in the background with no visible performance decrease

32 httpneo4jcom

46Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically the developer will toggle some breakpoints to stop the executionof the program of interest at locations deemed relevant to fix the fault at handThe SDT collects the data associated to these breakpoints (locations condi-tions and so on) After toggling breakpoints the developer runs the programin debug mode The program stops at the first reached breakpoint Conse-quently for each event such as Step Into or Breakpoint the SDT capturesthe event and related data It also stores data about methods called storing

Swarm Debugging the Collective Intelligence on Interactive Debugging 47

Fig 19 Breakpoint search tool (fuzzy search example)

invocations entry for each pair invokinginvoked method Following the forag-ing approach [57] the SDT only collects invokinginvoked methods that werevisited by the developer during the debugging session ignoring other invoca-tions The debugging activity continues until the program run finishes TheSwarm session is then completed

To avoid performance and memory issues the SDT collects and sends thedata using a set of specialised DomainServices that send RESTful messagesto a SwarmRestFacade connecting to the Swarm Debugging Services

Swarm Debugging Views

On top of the SDS the SDI implements and proposes several tools to searchand visualise the data collected during debugging sessions These tools areintegrated in the Eclipse IDE simplifying their usage They include but arenot limited to the followings

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] torepresent sequences of method invocations as shown by Figure 20 They usecircles to represent methods and arrows to represent invocations Each line isa complete stack trace without returns The first node is a starting method(non-invoked method) and the last node is an ending method (non-invokingmethod) If an invocation chain contains a non-starting method a new line iscreated and the actual stack is repeated and a dotted arrow is used to representa return for this node as illustrated by the method Circledraw in Figure 20In addition developers can directly go to a method in the Eclipse Editor bydouble-clicking over a node in the diagram

Dynamic Method Call Graphs They are direct call graphs [31] as shown inFigure 21 to display the hierarchical relations between invoked methods Theyuse circles to represent methods and oriented arrows to express invocationsEach session generates a graph and all invocations collected during the sessionare shown on these graphs The starting points (non-invoked methods) areallocated on top of a tree and adjacent nodes represent invocations sequences

48Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invocation methods pressing the F9(forward) and F10 (backward) keys They can also directly go to a method inthe Eclipse Editor by double-clicking on nodes in the graphs

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58]when working with the debugger For each breakpoint the SDS captures thetype and location in the type where the breakpoint was toggled Thus de-velopers can share their breakpoints The breakpoint search tool allows fuzzymatch and wildcard ElasticSearch queries Results are displayed in the SearchView table for easy selection Developers can also open a type directly in theEclipse Editor by double-clicking on a selected breakpoint

Figure 19 shows an example of breakpoint search in which the search boxcontains the misspelled word fcatory

Swarm Debugging the Collective Intelligence on Interactive Debugging 49

Fig 21 Method call graph for Bridge design pattern [17]

StartingEnding Method Search Tool

This tool allows searching for methods that (1) only invoke other methods butthat are not explicitly invoked themselves during the debugging session and(2) that are only invoked by others but that do not invoke other methods

Formally we define StartingEnding methods as follows Given a graphG = (VE) where V is a set of vertexes V = V1 V2 Vn and E is a setof edges E = (V1 V2) (V 1 V 3) Then each edge is formed by a pairlt Vi Vj gt were Vi is the invoking method and Vj is the invoked methodIf α is the subset of all vertexes invoking methods and β is the subset of allvertexes invoked by methods then the Starting and Ending methods are

StartingPoint = VSP | VSP isin α and VSP isin β

EndingPoint = VEP | VEP isin β and VEP isin α

Locating these methods is important in a debugging session because theyare the entries and exits points of a program at runtime

Summary

Through the SDI we provide a technique and model to collect store and shareinteractive debugging session data contextualizing breakpoints and eventsduring these sessions We created real-time and interactive visualizations usingweb technologies providing an automatic memory for developer explorationsMoreover dividing software exploration by sessions and its call graphs areeasy to understand because only intentional visited areas are shown on these

50Please give a shorter version with authorrunning and titlerunning prior to maketitle

graphs one can go through the execution of a project and see only the impor-tant areas that are relevant to developers

Currently the Swarm Tracer is implemented in Java using Eclipse DebugCore services However SDI provides a RESTful API that can be accessedindependently and new tracers can be implemented for different IDEs or de-buggers

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 40: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

40Please give a shorter version with authorrunning and titlerunning prior to maketitle

15 R Tiarks T Rohm Softwaretechnik-Trends 32(2) 19 (2013) DOI 101007BF03323460 URL httplinkspringercom101007BF03323460

16 F Petrillo G Lacerda M Pimenta C Freitas in 2015 IEEE 3rd Working Con-ference on Software Visualization (VISSOFT) (IEEE 2015) pp 140ndash144 DOI101109VISSOFT20157332425

17 F Petrillo Z Soh F Khomh M Pimenta C Freitas YG Gueheneuc in In Proceed-ings of the 2016 IEEE International Conference on Software Quality Reliability andSecurity (QRS) (2016) p 10

18 F Petrillo H Mandian A Yamashita F Khomh YG Gueheneuc in 2017 IEEEInternational Conference on Software Quality Reliability and Security (QRS) (2017)pp 285ndash295 DOI 101109QRS201739

19 K Araki Z Furukawa J Cheng Software IEEE 8(3) 14 (1991) DOI 101109528893920 I Zayour A Hamdar Information and Software Technology 70 130 (2016) DOI

101016jinfsof20151001021 R Chern K De Volder in Proceedings of the 6th International Conference on Aspect-

oriented Software Development (ACM New York NY USA 2007) AOSD rsquo07 pp96ndash106 DOI 10114512185631218575 URL httpdoiacmorg1011451218563

1218575

22 Eclipse Managing conditional breakpoints URL httphelpeclipseorg

neonindexjsptopic=2Forgeclipsejdtdocuser2Ftasks2Ftask-manage_

conditional_breakpointhtmampcp=1_3_6_0_5

23 S Garnier J Gautrais G Theraulaz Swarm Intelligence 1(1) 3 (2007) DOI 101007s11721-007-0004-y

24 WR Tschinkel Journal of Bioeconomics 17(3) 271 (2015) DOI 101007s10818-015-9203-6 URL httpdxdoiorg101007s10818-015-9203-6http

linkspringercom101007s10818-015-9203-6

25 T Ball S Eick Computer 29(4) 33 (1996) DOI 101109248829926 A Cockburn Agile Software Development The Cooperative Game Second Edition

(Addison-Wesley Professional 2006)27 J Lawrance C Bogart M Burnett R Bellamy K Rector SD Fleming IEEE Trans-

actions on Software Engineering 39(2) 197 (2013) DOI 101109TSE201011128 D Piorkowski SD Fleming C Scaffidi M Burnett I Kwan AZ Henley J Macbeth

C Hill A Horvath in 2015 IEEE International Conference on Software Maintenanceand Evolution (ICSME) (2015) pp 11ndash20 DOI 101109ICSM20157332447

29 G Pothier E Tanter IEEE Software 26(6) 78 (2009) DOI 101109MS2009169URL httpieeexploreieeeorglpdocsepic03wrapperhtmarnumber=5287015

30 M Beller N Spruit D Spinellis A Zaidman in 40th International Conference on Soware Engineering ICSE (2018) pp 572ndash583

31 D Grove G DeFouw J Dean C Chambers Proceedings of the 12th ACM SIG-PLAN conference on Object-oriented programming systems languages and appli-cations - OOPSLA rsquo97 pp 108ndash124 (1997) DOI 101145263698264352 URLhttpportalacmorgcitationcfmdoid=263698264352

32 R Saito ME Smoot K Ono J Ruscheinski Pl Wang S Lotia AR Pico GDBader T Ideker Nature methods 9(11) 1069 (2012) DOI 101038nmeth2212URL httpwwwpubmedcentralnihgovarticlerenderfcgiartid=3649846amptool=

pmcentrezamprendertype=abstract

33 S Jiang C McMillan R Santelices Empirical Software Engineering pp 1ndash39 (2016)DOI 101007s10664-016-9441-9

34 R Pienta J Abello M Kahng DH Chau in 2015 International Conference on BigData and Smart Computing (BIGCOMP) (IEEE 2015) pp 271ndash278 DOI 10110935021BIGCOMP20157072812

35 J Sillito GC Murphy KD Volder IEEE Transactions on Software Engineering 34(4)434 (2008)

36 W Maalej R Tiarks T Roehm R Koschke ACM Transactions on Software En-gineering and Methodology 23(4) 311 (2014) DOI 1011452622669 URL http

doiacmorg1011452622669

37 AJ Ko BA Myers MJ Coblenz HH Aung IEEE Transaction on Software Engi-neering 32(12) 971 (2006)

Swarm Debugging the Collective Intelligence on Interactive Debugging 41

38 S Wang D Lo in Proceedings of the 22nd International Conference on Program Com-prehension - ICPC 2014 (ACM Press New York New York USA 2014) pp 53ndash63DOI 10114525970082597148

39 J Zhou H Zhang D Lo in 2012 34th International Conference on Software Engi-neering (ICSE) (IEEE 2012) pp 14ndash24 DOI 101109ICSE20126227210

40 X Ye R Bunescu C Liu in Proceedings of the 22nd ACM SIGSOFT InternationalSymposium on Foundations of Software Engineering - FSE 2014 (ACM Press NewYork New York USA 2014) pp 689ndash699 DOI 10114526358682635874

41 M Kersten GC Murphy in Proceedings of the 14th ACM SIGSOFT internationalsymposium on Foundations of software engineering (2006) pp 1ndash11

42 H Sanchez R Robbes VM Gonzalez in Software Analysis Evolution and Reengi-neering (SANER) 2015 IEEE 22nd International Conference on (2015) pp 251ndash260

43 A Ying M Robillard in Proceedings International Conference on Program Compre-hension (2011) pp 31ndash40

44 F Zhang F Khomh Y Zou AE Hassan in Proceedings Working Conference onReverse Engineering (2012) pp 456ndash465

45 Z Soh F Khomh YG Gueheneuc G Antoniol B Adams in Reverse Engineering(WCRE) 2013 20th Working Conference on (2013) pp 391ndash400 DOI 101109WCRE20136671314

46 TM Ahmed W Shang AE Hassan in Mining Software Repositories (MSR) 2015IEEEACM 12th Working Conference on (2015) pp 99ndash110 DOI 101109MSR201517

47 P Romero B du Boulay R Cox R Lutz S Bryant International Journal of Human-Computer Studies 65(12) 992 (2007) DOI 101016jijhcs200707005

48 I Katz J Anderson Human-Computer Interaction 3(4) 351 (1987) DOI 101207s15327051hci0304 2

49 B Ashok J Joy H Liang SK Rajamani G Srinivasa V Vangala in Proceedings ofthe 7th joint meeting of the European software engineering conference and the ACMSIGSOFT symposium on The foundations of software engineering on European soft-ware engineering conference and foundations of software engineering symposium - E(ACM Press New York New York USA 2009) p 373 DOI 10114515956961595766

50 C Parnin A Orso Proceedings of the 2011 International Symposium on SoftwareTesting and Analysis ISSTA 11 p 199 (2011) DOI 10114520014202001445

51 AX Zheng MI Jordan B Liblit M Naik A Aiken Challenges 148 1105 (2006)DOI 10114511438441143983

52 A Ko BA Myers in CHI 2009 - Proceedings of the SIGCHI Conference on HumanFactors in Computing Systems ed by ACM (New York New York USA 2009) pp1569ndash1578 DOI 10114515187011518942

53 B Hofer F Wotawa Advances in Software Engineering 2012 13 (2012) DOI 1011552012628571

54 J Ressia A Bergel O Nierstrasz in Proceedings - International Conference on Soft-ware Engineering (2012) pp 485ndash495 DOI 101109ICSE20126227167

55 HC Estler M Nordio Ca Furia B Meyer Proceedings - IEEE 8th InternationalConference on Global Software Engineering ICGSE 2013 pp 110ndash119 (2013) DOI101109ICGSE201321

56 S Demeyer S Ducasse M Lanza in Proceedings of the Sixth Working Conference onReverse Engineering (IEEE Computer Society Washington DC USA 1999) WCRErsquo99 pp 175ndash URL httpdlacmorgcitationcfmid=832306837044

57 D Piorkowski S Fleming in Proceedings of the SIGCHI Conference on Human Factorsin Computing Systems CHI rsquo13 (ACM Paris France 2013) pp 3063mdash-3072 DOI10114524664162466418 URL httpdlacmorgcitationcfmid=2466418

58 SD Fleming C Scaffidi D Piorkowski M Burnett R Bellamy J Lawrance I KwanACM Transactions on Software Engineering and Methodology 22(2) 1 (2013) DOI10114524305452430551

42Please give a shorter version with authorrunning and titlerunning prior to maketitle

Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed bythe Swarm Debugging Tracer (SDT) to store and later share debugging datafrom and between developers Figure 13 shows the architecture of this in-frastructure The SDT sends RESTful messages that are received by a SDSinstance that stores them in three specialized persistence mechanisms an SQLdatabase (PostgreSQL) a full-text search engine (ElasticSearch) and a graphdatabase (Neo4J)

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define thesemantics of the SDT messages

We choose and define domain concepts to model software projects anddebugging data Figure 14 shows the meta-model of these concepts using anentity-relationship representation The concepts are inspired by the FAMIXData model [56] The concepts include

ndash Developer is a SDT user She creates and executes debugging sessionsndash Product is the target software product A product is a set of Eclipse

projects (1 or more)ndash Task is the task to be executed by developersndash Session represents a Swarm Debugging session It relates developer project

and debugging events

Swarm Debugging the Collective Intelligence on Interactive Debugging 43

Fig 14 The Swarm Debugging metadata [17]

ndash Type represents classes and interfaces in the project Each type has asource code and a file SDS only considers types that have source codeavailable as belonging to the project domain

ndash Method is a method associated with a type which can be invoked duringdebugging sessions

ndash Namespace is a container for types In Java namespaces are declaredwith the keyword package

ndash Invocation is a method invoked from another method (or from the JVMin case of the main method)

ndash Breakpoint represents the data collected when a developer toggles abreakpoint in the Eclipse IDE Each breakpoint is associated with a typeand a method if appropriate

ndash Event is an event data that is collected when a developer performs someactions during a debugging session

The SDS provides several services for manipulating querying and search-ing collected data (1) Swarm RESTful API (2) SQL query console (3) full-text search API (4) dashboard service and (5) graph querying console

Swarm RESTful API The SDS provides a RESTful API to manipulate de-bugging data using the Spring Boot framework29 Create retrieve update

29 httpprojectsspringiospring-boot

44Please give a shorter version with authorrunning and titlerunning prior to maketitle

and delete operations are available through HTTP requests and respond witha JSON structure For example upon submitting the HTTP request

httpswarmdebuggingorgdevelopers

searchfindByNamename=petrillo

the SDS responds with a list of developers whose names are ldquopetrillordquo inJSON format

SQL Query Console The SDS provides a console30 to receive SQL queries(SQL) on the debugging data providing relational aggregations and functions

Full-text Search Engine The SDS also provides an ElasticSearch31 which isa highly scalable open-source full-text search and analytic engine to storesearch and analyse the debugging data The SDS instantiates an instance ofthe ElasticSearch engine and offers a console for executing complex queries onthe debugging data

Dashboard Service The ElasticSearch allows the use of the Kibana dashboardThe SDS exposes a Kibana instance on the debugging data With the dash-board researchers can build charts describing the data Figure 15 shows aSwarm Dashboard embedded into Eclipse as a view

Fig 15 Swarm Debugging Dashboard

30 httpdbswarmdebuggingorg31 httpswwwelasticco

Swarm Debugging the Collective Intelligence on Interactive Debugging 45

Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32

graph database Neo4J provides a query language named Cypher which is adeclarative SQL-inspired language for describing patterns in graphs It allowsresearchers to express what they want to select insert update or delete froma graph database without describing precisely how to do it The SDS exposesthe Neo4J Browser and creates an Eclipse view

Figure 16 shows an example of Cypher query and the resulting graph

Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debug-ger events during debugging sessions extending the Java Platform DebuggingArchitecture (JDPA) Using the Eclipse JPDA events are listened by our De-bugTracer that implements two listenersIDebugEventSetListener and IBreakpointListener Figure 17 shows theSDT architecture

After an authentication process developers create a debugging session us-ing the Swarm Manager view and toggle breakpoints trigger stepping eventsas Step Into Step Over or Step Return These events are caught and stacktrace items are analyzed by the Tracer extracting method invocations

To use the SDT a developer must open the view ldquoSwarm Managerrdquo and es-tablish a connection with the Swarm Debugging Services If the target projectis not into the Swarm Manager she can associate any project in her work-space into Swarm Manager (as shown in Figure 18) This association consistsof linking a Swarm Session with a project in the Eclipse workspace Secondshe must create a Swarm session Once a session is established she can useany feature of the regular Eclipse debugger the SDT collects developersrsquo in-teraction events in the background with no visible performance decrease

32 httpneo4jcom

46Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically the developer will toggle some breakpoints to stop the executionof the program of interest at locations deemed relevant to fix the fault at handThe SDT collects the data associated to these breakpoints (locations condi-tions and so on) After toggling breakpoints the developer runs the programin debug mode The program stops at the first reached breakpoint Conse-quently for each event such as Step Into or Breakpoint the SDT capturesthe event and related data It also stores data about methods called storing

Swarm Debugging the Collective Intelligence on Interactive Debugging 47

Fig 19 Breakpoint search tool (fuzzy search example)

invocations entry for each pair invokinginvoked method Following the forag-ing approach [57] the SDT only collects invokinginvoked methods that werevisited by the developer during the debugging session ignoring other invoca-tions The debugging activity continues until the program run finishes TheSwarm session is then completed

To avoid performance and memory issues the SDT collects and sends thedata using a set of specialised DomainServices that send RESTful messagesto a SwarmRestFacade connecting to the Swarm Debugging Services

Swarm Debugging Views

On top of the SDS the SDI implements and proposes several tools to searchand visualise the data collected during debugging sessions These tools areintegrated in the Eclipse IDE simplifying their usage They include but arenot limited to the followings

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] torepresent sequences of method invocations as shown by Figure 20 They usecircles to represent methods and arrows to represent invocations Each line isa complete stack trace without returns The first node is a starting method(non-invoked method) and the last node is an ending method (non-invokingmethod) If an invocation chain contains a non-starting method a new line iscreated and the actual stack is repeated and a dotted arrow is used to representa return for this node as illustrated by the method Circledraw in Figure 20In addition developers can directly go to a method in the Eclipse Editor bydouble-clicking over a node in the diagram

Dynamic Method Call Graphs They are direct call graphs [31] as shown inFigure 21 to display the hierarchical relations between invoked methods Theyuse circles to represent methods and oriented arrows to express invocationsEach session generates a graph and all invocations collected during the sessionare shown on these graphs The starting points (non-invoked methods) areallocated on top of a tree and adjacent nodes represent invocations sequences

48Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invocation methods pressing the F9(forward) and F10 (backward) keys They can also directly go to a method inthe Eclipse Editor by double-clicking on nodes in the graphs

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58]when working with the debugger For each breakpoint the SDS captures thetype and location in the type where the breakpoint was toggled Thus de-velopers can share their breakpoints The breakpoint search tool allows fuzzymatch and wildcard ElasticSearch queries Results are displayed in the SearchView table for easy selection Developers can also open a type directly in theEclipse Editor by double-clicking on a selected breakpoint

Figure 19 shows an example of breakpoint search in which the search boxcontains the misspelled word fcatory

Swarm Debugging the Collective Intelligence on Interactive Debugging 49

Fig 21 Method call graph for Bridge design pattern [17]

StartingEnding Method Search Tool

This tool allows searching for methods that (1) only invoke other methods butthat are not explicitly invoked themselves during the debugging session and(2) that are only invoked by others but that do not invoke other methods

Formally we define StartingEnding methods as follows Given a graphG = (VE) where V is a set of vertexes V = V1 V2 Vn and E is a setof edges E = (V1 V2) (V 1 V 3) Then each edge is formed by a pairlt Vi Vj gt were Vi is the invoking method and Vj is the invoked methodIf α is the subset of all vertexes invoking methods and β is the subset of allvertexes invoked by methods then the Starting and Ending methods are

StartingPoint = VSP | VSP isin α and VSP isin β

EndingPoint = VEP | VEP isin β and VEP isin α

Locating these methods is important in a debugging session because theyare the entries and exits points of a program at runtime

Summary

Through the SDI we provide a technique and model to collect store and shareinteractive debugging session data contextualizing breakpoints and eventsduring these sessions We created real-time and interactive visualizations usingweb technologies providing an automatic memory for developer explorationsMoreover dividing software exploration by sessions and its call graphs areeasy to understand because only intentional visited areas are shown on these

50Please give a shorter version with authorrunning and titlerunning prior to maketitle

graphs one can go through the execution of a project and see only the impor-tant areas that are relevant to developers

Currently the Swarm Tracer is implemented in Java using Eclipse DebugCore services However SDI provides a RESTful API that can be accessedindependently and new tracers can be implemented for different IDEs or de-buggers

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 41: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

Swarm Debugging the Collective Intelligence on Interactive Debugging 41

38. S. Wang, D. Lo, in Proceedings of the 22nd International Conference on Program Comprehension – ICPC 2014 (ACM Press, New York, 2014), pp. 53–63. DOI 10.1145/2597008.2597148

39. J. Zhou, H. Zhang, D. Lo, in 2012 34th International Conference on Software Engineering (ICSE) (IEEE, 2012), pp. 14–24. DOI 10.1109/ICSE.2012.6227210

40. X. Ye, R. Bunescu, C. Liu, in Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering – FSE 2014 (ACM Press, New York, 2014), pp. 689–699. DOI 10.1145/2635868.2635874

41. M. Kersten, G.C. Murphy, in Proceedings of the 14th ACM SIGSOFT International Symposium on Foundations of Software Engineering (2006), pp. 1–11

42. H. Sanchez, R. Robbes, V.M. Gonzalez, in Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on (2015), pp. 251–260

43. A. Ying, M. Robillard, in Proceedings International Conference on Program Comprehension (2011), pp. 31–40

44. F. Zhang, F. Khomh, Y. Zou, A.E. Hassan, in Proceedings Working Conference on Reverse Engineering (2012), pp. 456–465

45. Z. Soh, F. Khomh, Y.-G. Guéhéneuc, G. Antoniol, B. Adams, in Reverse Engineering (WCRE), 2013 20th Working Conference on (2013), pp. 391–400. DOI 10.1109/WCRE.2013.6671314

46. T.M. Ahmed, W. Shang, A.E. Hassan, in Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on (2015), pp. 99–110. DOI 10.1109/MSR.2015.17

47. P. Romero, B. du Boulay, R. Cox, R. Lutz, S. Bryant, International Journal of Human-Computer Studies 65(12), 992 (2007). DOI 10.1016/j.ijhcs.2007.07.005

48. I. Katz, J. Anderson, Human-Computer Interaction 3(4), 351 (1987). DOI 10.1207/s15327051hci0304_2

49. B. Ashok, J. Joy, H. Liang, S.K. Rajamani, G. Srinivasa, V. Vangala, in Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering (ESEC/FSE '09) (ACM Press, New York, 2009), p. 373. DOI 10.1145/1595696.1595766

50. C. Parnin, A. Orso, in Proceedings of the 2011 International Symposium on Software Testing and Analysis, ISSTA '11 (2011), p. 199. DOI 10.1145/2001420.2001445

51. A.X. Zheng, M.I. Jordan, B. Liblit, M. Naik, A. Aiken, Challenges 148, 1105 (2006). DOI 10.1145/1143844.1143983

52. A. Ko, B.A. Myers, in CHI 2009 – Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York, 2009), pp. 1569–1578. DOI 10.1145/1518701.1518942

53. B. Hofer, F. Wotawa, Advances in Software Engineering 2012, 13 (2012). DOI 10.1155/2012/628571

54. J. Ressia, A. Bergel, O. Nierstrasz, in Proceedings – International Conference on Software Engineering (2012), pp. 485–495. DOI 10.1109/ICSE.2012.6227167

55. H.C. Estler, M. Nordio, C.A. Furia, B. Meyer, in Proceedings – IEEE 8th International Conference on Global Software Engineering, ICGSE 2013 (2013), pp. 110–119. DOI 10.1109/ICGSE.2013.21

56. S. Demeyer, S. Ducasse, M. Lanza, in Proceedings of the Sixth Working Conference on Reverse Engineering, WCRE '99 (IEEE Computer Society, Washington, DC, 1999), pp. 175–. URL http://dl.acm.org/citation.cfm?id=832306.837044

57. D. Piorkowski, S. Fleming, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13 (ACM, Paris, France, 2013), pp. 3063–3072. DOI 10.1145/2466416.2466418. URL http://dl.acm.org/citation.cfm?id=2466418

58. S.D. Fleming, C. Scaffidi, D. Piorkowski, M. Burnett, R. Bellamy, J. Lawrance, I. Kwan, ACM Transactions on Software Engineering and Methodology 22(2), 1 (2013). DOI 10.1145/2430545.2430551

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 42: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

42Please give a shorter version with authorrunning and titlerunning prior to maketitle

Appendix - Implementation of Swarm Debugging

Swarm Debugging Services

The Swarm Debugging Services (SDS) provide the infrastructure needed bythe Swarm Debugging Tracer (SDT) to store and later share debugging datafrom and between developers Figure 13 shows the architecture of this in-frastructure The SDT sends RESTful messages that are received by a SDSinstance that stores them in three specialized persistence mechanisms an SQLdatabase (PostgreSQL) a full-text search engine (ElasticSearch) and a graphdatabase (Neo4J)

Fig 13 The Swarm Debugging Services architecture

The three persistence mechanisms use similar sets of concepts to define thesemantics of the SDT messages

We choose and define domain concepts to model software projects anddebugging data Figure 14 shows the meta-model of these concepts using anentity-relationship representation The concepts are inspired by the FAMIXData model [56] The concepts include

ndash Developer is a SDT user She creates and executes debugging sessionsndash Product is the target software product A product is a set of Eclipse

projects (1 or more)ndash Task is the task to be executed by developersndash Session represents a Swarm Debugging session It relates developer project

and debugging events

Swarm Debugging the Collective Intelligence on Interactive Debugging 43

Fig 14 The Swarm Debugging metadata [17]

ndash Type represents classes and interfaces in the project Each type has asource code and a file SDS only considers types that have source codeavailable as belonging to the project domain

ndash Method is a method associated with a type which can be invoked duringdebugging sessions

ndash Namespace is a container for types In Java namespaces are declaredwith the keyword package

ndash Invocation is a method invoked from another method (or from the JVMin case of the main method)

ndash Breakpoint represents the data collected when a developer toggles abreakpoint in the Eclipse IDE Each breakpoint is associated with a typeand a method if appropriate

ndash Event is an event data that is collected when a developer performs someactions during a debugging session

The SDS provides several services for manipulating querying and search-ing collected data (1) Swarm RESTful API (2) SQL query console (3) full-text search API (4) dashboard service and (5) graph querying console

Swarm RESTful API The SDS provides a RESTful API to manipulate de-bugging data using the Spring Boot framework29 Create retrieve update

29 httpprojectsspringiospring-boot

44Please give a shorter version with authorrunning and titlerunning prior to maketitle

and delete operations are available through HTTP requests and respond witha JSON structure For example upon submitting the HTTP request

httpswarmdebuggingorgdevelopers

searchfindByNamename=petrillo

the SDS responds with a list of developers whose names are ldquopetrillordquo inJSON format

SQL Query Console The SDS provides a console30 to receive SQL queries(SQL) on the debugging data providing relational aggregations and functions

Full-text Search Engine The SDS also provides an ElasticSearch31 which isa highly scalable open-source full-text search and analytic engine to storesearch and analyse the debugging data The SDS instantiates an instance ofthe ElasticSearch engine and offers a console for executing complex queries onthe debugging data

Dashboard Service The ElasticSearch allows the use of the Kibana dashboardThe SDS exposes a Kibana instance on the debugging data With the dash-board researchers can build charts describing the data Figure 15 shows aSwarm Dashboard embedded into Eclipse as a view

Fig 15 Swarm Debugging Dashboard

30 httpdbswarmdebuggingorg31 httpswwwelasticco

Swarm Debugging the Collective Intelligence on Interactive Debugging 45

Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32

graph database Neo4J provides a query language named Cypher which is adeclarative SQL-inspired language for describing patterns in graphs It allowsresearchers to express what they want to select insert update or delete froma graph database without describing precisely how to do it The SDS exposesthe Neo4J Browser and creates an Eclipse view

Figure 16 shows an example of Cypher query and the resulting graph

Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debug-ger events during debugging sessions extending the Java Platform DebuggingArchitecture (JDPA) Using the Eclipse JPDA events are listened by our De-bugTracer that implements two listenersIDebugEventSetListener and IBreakpointListener Figure 17 shows theSDT architecture

After an authentication process developers create a debugging session us-ing the Swarm Manager view and toggle breakpoints trigger stepping eventsas Step Into Step Over or Step Return These events are caught and stacktrace items are analyzed by the Tracer extracting method invocations

To use the SDT a developer must open the view ldquoSwarm Managerrdquo and es-tablish a connection with the Swarm Debugging Services If the target projectis not into the Swarm Manager she can associate any project in her work-space into Swarm Manager (as shown in Figure 18) This association consistsof linking a Swarm Session with a project in the Eclipse workspace Secondshe must create a Swarm session Once a session is established she can useany feature of the regular Eclipse debugger the SDT collects developersrsquo in-teraction events in the background with no visible performance decrease

32 httpneo4jcom

46Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically the developer will toggle some breakpoints to stop the executionof the program of interest at locations deemed relevant to fix the fault at handThe SDT collects the data associated to these breakpoints (locations condi-tions and so on) After toggling breakpoints the developer runs the programin debug mode The program stops at the first reached breakpoint Conse-quently for each event such as Step Into or Breakpoint the SDT capturesthe event and related data It also stores data about methods called storing

Swarm Debugging the Collective Intelligence on Interactive Debugging 47

Fig 19 Breakpoint search tool (fuzzy search example)

invocations entry for each pair invokinginvoked method Following the forag-ing approach [57] the SDT only collects invokinginvoked methods that werevisited by the developer during the debugging session ignoring other invoca-tions The debugging activity continues until the program run finishes TheSwarm session is then completed

To avoid performance and memory issues the SDT collects and sends thedata using a set of specialised DomainServices that send RESTful messagesto a SwarmRestFacade connecting to the Swarm Debugging Services

Swarm Debugging Views

On top of the SDS the SDI implements and proposes several tools to searchand visualise the data collected during debugging sessions These tools areintegrated in the Eclipse IDE simplifying their usage They include but arenot limited to the followings

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] torepresent sequences of method invocations as shown by Figure 20 They usecircles to represent methods and arrows to represent invocations Each line isa complete stack trace without returns The first node is a starting method(non-invoked method) and the last node is an ending method (non-invokingmethod) If an invocation chain contains a non-starting method a new line iscreated and the actual stack is repeated and a dotted arrow is used to representa return for this node as illustrated by the method Circledraw in Figure 20In addition developers can directly go to a method in the Eclipse Editor bydouble-clicking over a node in the diagram

Dynamic Method Call Graphs They are direct call graphs [31] as shown inFigure 21 to display the hierarchical relations between invoked methods Theyuse circles to represent methods and oriented arrows to express invocationsEach session generates a graph and all invocations collected during the sessionare shown on these graphs The starting points (non-invoked methods) areallocated on top of a tree and adjacent nodes represent invocations sequences

48Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invocation methods pressing the F9(forward) and F10 (backward) keys They can also directly go to a method inthe Eclipse Editor by double-clicking on nodes in the graphs

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58]when working with the debugger For each breakpoint the SDS captures thetype and location in the type where the breakpoint was toggled Thus de-velopers can share their breakpoints The breakpoint search tool allows fuzzymatch and wildcard ElasticSearch queries Results are displayed in the SearchView table for easy selection Developers can also open a type directly in theEclipse Editor by double-clicking on a selected breakpoint

Figure 19 shows an example of breakpoint search in which the search boxcontains the misspelled word fcatory

Swarm Debugging the Collective Intelligence on Interactive Debugging 49

Fig 21 Method call graph for Bridge design pattern [17]

StartingEnding Method Search Tool

This tool allows searching for methods that (1) only invoke other methods butthat are not explicitly invoked themselves during the debugging session and(2) that are only invoked by others but that do not invoke other methods

Formally we define StartingEnding methods as follows Given a graphG = (VE) where V is a set of vertexes V = V1 V2 Vn and E is a setof edges E = (V1 V2) (V 1 V 3) Then each edge is formed by a pairlt Vi Vj gt were Vi is the invoking method and Vj is the invoked methodIf α is the subset of all vertexes invoking methods and β is the subset of allvertexes invoked by methods then the Starting and Ending methods are

StartingPoint = VSP | VSP isin α and VSP isin β

EndingPoint = VEP | VEP isin β and VEP isin α

Locating these methods is important in a debugging session because theyare the entries and exits points of a program at runtime

Summary

Through the SDI we provide a technique and model to collect store and shareinteractive debugging session data contextualizing breakpoints and eventsduring these sessions We created real-time and interactive visualizations usingweb technologies providing an automatic memory for developer explorationsMoreover dividing software exploration by sessions and its call graphs areeasy to understand because only intentional visited areas are shown on these

50Please give a shorter version with authorrunning and titlerunning prior to maketitle

graphs one can go through the execution of a project and see only the impor-tant areas that are relevant to developers

Currently the Swarm Tracer is implemented in Java using Eclipse DebugCore services However SDI provides a RESTful API that can be accessedindependently and new tracers can be implemented for different IDEs or de-buggers

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 43: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

Swarm Debugging the Collective Intelligence on Interactive Debugging 43

Fig 14 The Swarm Debugging metadata [17]

ndash Type represents classes and interfaces in the project Each type has asource code and a file SDS only considers types that have source codeavailable as belonging to the project domain

ndash Method is a method associated with a type which can be invoked duringdebugging sessions

ndash Namespace is a container for types In Java namespaces are declaredwith the keyword package

ndash Invocation is a method invoked from another method (or from the JVMin case of the main method)

ndash Breakpoint represents the data collected when a developer toggles abreakpoint in the Eclipse IDE Each breakpoint is associated with a typeand a method if appropriate

ndash Event is an event data that is collected when a developer performs someactions during a debugging session

The SDS provides several services for manipulating querying and search-ing collected data (1) Swarm RESTful API (2) SQL query console (3) full-text search API (4) dashboard service and (5) graph querying console

Swarm RESTful API The SDS provides a RESTful API to manipulate de-bugging data using the Spring Boot framework29 Create retrieve update

29 httpprojectsspringiospring-boot

44Please give a shorter version with authorrunning and titlerunning prior to maketitle

and delete operations are available through HTTP requests and respond witha JSON structure For example upon submitting the HTTP request

httpswarmdebuggingorgdevelopers

searchfindByNamename=petrillo

the SDS responds with a list of developers whose names are ldquopetrillordquo inJSON format

SQL Query Console The SDS provides a console30 to receive SQL queries(SQL) on the debugging data providing relational aggregations and functions

Full-text Search Engine The SDS also provides an ElasticSearch31 which isa highly scalable open-source full-text search and analytic engine to storesearch and analyse the debugging data The SDS instantiates an instance ofthe ElasticSearch engine and offers a console for executing complex queries onthe debugging data

Dashboard Service The ElasticSearch allows the use of the Kibana dashboardThe SDS exposes a Kibana instance on the debugging data With the dash-board researchers can build charts describing the data Figure 15 shows aSwarm Dashboard embedded into Eclipse as a view

Fig 15 Swarm Debugging Dashboard

30 httpdbswarmdebuggingorg31 httpswwwelasticco

Swarm Debugging the Collective Intelligence on Interactive Debugging 45

Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32

graph database Neo4J provides a query language named Cypher which is adeclarative SQL-inspired language for describing patterns in graphs It allowsresearchers to express what they want to select insert update or delete froma graph database without describing precisely how to do it The SDS exposesthe Neo4J Browser and creates an Eclipse view

Figure 16 shows an example of Cypher query and the resulting graph

Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debug-ger events during debugging sessions extending the Java Platform DebuggingArchitecture (JDPA) Using the Eclipse JPDA events are listened by our De-bugTracer that implements two listenersIDebugEventSetListener and IBreakpointListener Figure 17 shows theSDT architecture

After an authentication process developers create a debugging session us-ing the Swarm Manager view and toggle breakpoints trigger stepping eventsas Step Into Step Over or Step Return These events are caught and stacktrace items are analyzed by the Tracer extracting method invocations

To use the SDT a developer must open the view ldquoSwarm Managerrdquo and es-tablish a connection with the Swarm Debugging Services If the target projectis not into the Swarm Manager she can associate any project in her work-space into Swarm Manager (as shown in Figure 18) This association consistsof linking a Swarm Session with a project in the Eclipse workspace Secondshe must create a Swarm session Once a session is established she can useany feature of the regular Eclipse debugger the SDT collects developersrsquo in-teraction events in the background with no visible performance decrease

32 httpneo4jcom

46Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically the developer will toggle some breakpoints to stop the executionof the program of interest at locations deemed relevant to fix the fault at handThe SDT collects the data associated to these breakpoints (locations condi-tions and so on) After toggling breakpoints the developer runs the programin debug mode The program stops at the first reached breakpoint Conse-quently for each event such as Step Into or Breakpoint the SDT capturesthe event and related data It also stores data about methods called storing

Swarm Debugging the Collective Intelligence on Interactive Debugging 47

Fig 19 Breakpoint search tool (fuzzy search example)

invocations entry for each pair invokinginvoked method Following the forag-ing approach [57] the SDT only collects invokinginvoked methods that werevisited by the developer during the debugging session ignoring other invoca-tions The debugging activity continues until the program run finishes TheSwarm session is then completed

To avoid performance and memory issues the SDT collects and sends thedata using a set of specialised DomainServices that send RESTful messagesto a SwarmRestFacade connecting to the Swarm Debugging Services

Swarm Debugging Views

On top of the SDS the SDI implements and proposes several tools to searchand visualise the data collected during debugging sessions These tools areintegrated in the Eclipse IDE simplifying their usage They include but arenot limited to the followings

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] torepresent sequences of method invocations as shown by Figure 20 They usecircles to represent methods and arrows to represent invocations Each line isa complete stack trace without returns The first node is a starting method(non-invoked method) and the last node is an ending method (non-invokingmethod) If an invocation chain contains a non-starting method a new line iscreated and the actual stack is repeated and a dotted arrow is used to representa return for this node as illustrated by the method Circledraw in Figure 20In addition developers can directly go to a method in the Eclipse Editor bydouble-clicking over a node in the diagram

Dynamic Method Call Graphs They are direct call graphs [31] as shown inFigure 21 to display the hierarchical relations between invoked methods Theyuse circles to represent methods and oriented arrows to express invocationsEach session generates a graph and all invocations collected during the sessionare shown on these graphs The starting points (non-invoked methods) areallocated on top of a tree and adjacent nodes represent invocations sequences

48Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invocation methods pressing the F9(forward) and F10 (backward) keys They can also directly go to a method inthe Eclipse Editor by double-clicking on nodes in the graphs

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58]when working with the debugger For each breakpoint the SDS captures thetype and location in the type where the breakpoint was toggled Thus de-velopers can share their breakpoints The breakpoint search tool allows fuzzymatch and wildcard ElasticSearch queries Results are displayed in the SearchView table for easy selection Developers can also open a type directly in theEclipse Editor by double-clicking on a selected breakpoint

Figure 19 shows an example of breakpoint search in which the search boxcontains the misspelled word fcatory

Swarm Debugging the Collective Intelligence on Interactive Debugging 49

Fig 21 Method call graph for Bridge design pattern [17]

StartingEnding Method Search Tool

This tool allows searching for methods that (1) only invoke other methods butthat are not explicitly invoked themselves during the debugging session and(2) that are only invoked by others but that do not invoke other methods

Formally we define StartingEnding methods as follows Given a graphG = (VE) where V is a set of vertexes V = V1 V2 Vn and E is a setof edges E = (V1 V2) (V 1 V 3) Then each edge is formed by a pairlt Vi Vj gt were Vi is the invoking method and Vj is the invoked methodIf α is the subset of all vertexes invoking methods and β is the subset of allvertexes invoked by methods then the Starting and Ending methods are

StartingPoint = VSP | VSP isin α and VSP isin β

EndingPoint = VEP | VEP isin β and VEP isin α

Locating these methods is important in a debugging session because theyare the entries and exits points of a program at runtime

Summary

Through the SDI we provide a technique and model to collect store and shareinteractive debugging session data contextualizing breakpoints and eventsduring these sessions We created real-time and interactive visualizations usingweb technologies providing an automatic memory for developer explorationsMoreover dividing software exploration by sessions and its call graphs areeasy to understand because only intentional visited areas are shown on these

50Please give a shorter version with authorrunning and titlerunning prior to maketitle

graphs one can go through the execution of a project and see only the impor-tant areas that are relevant to developers

Currently the Swarm Tracer is implemented in Java using Eclipse DebugCore services However SDI provides a RESTful API that can be accessedindependently and new tracers can be implemented for different IDEs or de-buggers

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 44: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

44Please give a shorter version with authorrunning and titlerunning prior to maketitle

and delete operations are available through HTTP requests and respond witha JSON structure For example upon submitting the HTTP request

httpswarmdebuggingorgdevelopers

searchfindByNamename=petrillo

the SDS responds with a list of developers whose names are ldquopetrillordquo inJSON format

SQL Query Console The SDS provides a console30 to receive SQL queries(SQL) on the debugging data providing relational aggregations and functions

Full-text Search Engine The SDS also provides an ElasticSearch31 which isa highly scalable open-source full-text search and analytic engine to storesearch and analyse the debugging data The SDS instantiates an instance ofthe ElasticSearch engine and offers a console for executing complex queries onthe debugging data

Dashboard Service The ElasticSearch allows the use of the Kibana dashboardThe SDS exposes a Kibana instance on the debugging data With the dash-board researchers can build charts describing the data Figure 15 shows aSwarm Dashboard embedded into Eclipse as a view

Fig 15 Swarm Debugging Dashboard

30 httpdbswarmdebuggingorg31 httpswwwelasticco

Swarm Debugging the Collective Intelligence on Interactive Debugging 45

Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32

graph database Neo4J provides a query language named Cypher which is adeclarative SQL-inspired language for describing patterns in graphs It allowsresearchers to express what they want to select insert update or delete froma graph database without describing precisely how to do it The SDS exposesthe Neo4J Browser and creates an Eclipse view

Figure 16 shows an example of Cypher query and the resulting graph

Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debug-ger events during debugging sessions extending the Java Platform DebuggingArchitecture (JDPA) Using the Eclipse JPDA events are listened by our De-bugTracer that implements two listenersIDebugEventSetListener and IBreakpointListener Figure 17 shows theSDT architecture

After an authentication process developers create a debugging session us-ing the Swarm Manager view and toggle breakpoints trigger stepping eventsas Step Into Step Over or Step Return These events are caught and stacktrace items are analyzed by the Tracer extracting method invocations

To use the SDT a developer must open the view ldquoSwarm Managerrdquo and es-tablish a connection with the Swarm Debugging Services If the target projectis not into the Swarm Manager she can associate any project in her work-space into Swarm Manager (as shown in Figure 18) This association consistsof linking a Swarm Session with a project in the Eclipse workspace Secondshe must create a Swarm session Once a session is established she can useany feature of the regular Eclipse debugger the SDT collects developersrsquo in-teraction events in the background with no visible performance decrease

32 httpneo4jcom

46Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically the developer will toggle some breakpoints to stop the executionof the program of interest at locations deemed relevant to fix the fault at handThe SDT collects the data associated to these breakpoints (locations condi-tions and so on) After toggling breakpoints the developer runs the programin debug mode The program stops at the first reached breakpoint Conse-quently for each event such as Step Into or Breakpoint the SDT capturesthe event and related data It also stores data about methods called storing

Swarm Debugging the Collective Intelligence on Interactive Debugging 47

Fig 19 Breakpoint search tool (fuzzy search example)

invocations entry for each pair invokinginvoked method Following the forag-ing approach [57] the SDT only collects invokinginvoked methods that werevisited by the developer during the debugging session ignoring other invoca-tions The debugging activity continues until the program run finishes TheSwarm session is then completed

To avoid performance and memory issues the SDT collects and sends thedata using a set of specialised DomainServices that send RESTful messagesto a SwarmRestFacade connecting to the Swarm Debugging Services

Swarm Debugging Views

On top of the SDS the SDI implements and proposes several tools to searchand visualise the data collected during debugging sessions These tools areintegrated in the Eclipse IDE simplifying their usage They include but arenot limited to the followings

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] torepresent sequences of method invocations as shown by Figure 20 They usecircles to represent methods and arrows to represent invocations Each line isa complete stack trace without returns The first node is a starting method(non-invoked method) and the last node is an ending method (non-invokingmethod) If an invocation chain contains a non-starting method a new line iscreated and the actual stack is repeated and a dotted arrow is used to representa return for this node as illustrated by the method Circledraw in Figure 20In addition developers can directly go to a method in the Eclipse Editor bydouble-clicking over a node in the diagram

Dynamic Method Call Graphs They are direct call graphs [31] as shown inFigure 21 to display the hierarchical relations between invoked methods Theyuse circles to represent methods and oriented arrows to express invocationsEach session generates a graph and all invocations collected during the sessionare shown on these graphs The starting points (non-invoked methods) areallocated on top of a tree and adjacent nodes represent invocations sequences

48Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invocation methods pressing the F9(forward) and F10 (backward) keys They can also directly go to a method inthe Eclipse Editor by double-clicking on nodes in the graphs

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58]when working with the debugger For each breakpoint the SDS captures thetype and location in the type where the breakpoint was toggled Thus de-velopers can share their breakpoints The breakpoint search tool allows fuzzymatch and wildcard ElasticSearch queries Results are displayed in the SearchView table for easy selection Developers can also open a type directly in theEclipse Editor by double-clicking on a selected breakpoint

Figure 19 shows an example of breakpoint search in which the search boxcontains the misspelled word fcatory

Swarm Debugging the Collective Intelligence on Interactive Debugging 49

Fig 21 Method call graph for Bridge design pattern [17]

StartingEnding Method Search Tool

This tool allows searching for methods that (1) only invoke other methods butthat are not explicitly invoked themselves during the debugging session and(2) that are only invoked by others but that do not invoke other methods

Formally we define StartingEnding methods as follows Given a graphG = (VE) where V is a set of vertexes V = V1 V2 Vn and E is a setof edges E = (V1 V2) (V 1 V 3) Then each edge is formed by a pairlt Vi Vj gt were Vi is the invoking method and Vj is the invoked methodIf α is the subset of all vertexes invoking methods and β is the subset of allvertexes invoked by methods then the Starting and Ending methods are

StartingPoint = VSP | VSP isin α and VSP isin β

EndingPoint = VEP | VEP isin β and VEP isin α

Locating these methods is important in a debugging session because theyare the entries and exits points of a program at runtime

Summary

Through the SDI we provide a technique and model to collect store and shareinteractive debugging session data contextualizing breakpoints and eventsduring these sessions We created real-time and interactive visualizations usingweb technologies providing an automatic memory for developer explorationsMoreover dividing software exploration by sessions and its call graphs areeasy to understand because only intentional visited areas are shown on these

50Please give a shorter version with authorrunning and titlerunning prior to maketitle

graphs one can go through the execution of a project and see only the impor-tant areas that are relevant to developers

Currently the Swarm Tracer is implemented in Java using Eclipse DebugCore services However SDI provides a RESTful API that can be accessedindependently and new tracers can be implemented for different IDEs or de-buggers

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 45: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,

Swarm Debugging the Collective Intelligence on Interactive Debugging 45

Fig 16 Neo4J Browser - a Cypher query example

Graph Querying Console The SDS also persists debugging data in a Neo4J32

graph database Neo4J provides a query language named Cypher which is adeclarative SQL-inspired language for describing patterns in graphs It allowsresearchers to express what they want to select insert update or delete froma graph database without describing precisely how to do it The SDS exposesthe Neo4J Browser and creates an Eclipse view

Figure 16 shows an example of Cypher query and the resulting graph

Swarm Debugging Tracer

Swarm Debugging Tracer (SDT) is an Eclipse plug-in that listens to debug-ger events during debugging sessions extending the Java Platform DebuggingArchitecture (JDPA) Using the Eclipse JPDA events are listened by our De-bugTracer that implements two listenersIDebugEventSetListener and IBreakpointListener Figure 17 shows theSDT architecture

After an authentication process developers create a debugging session us-ing the Swarm Manager view and toggle breakpoints trigger stepping eventsas Step Into Step Over or Step Return These events are caught and stacktrace items are analyzed by the Tracer extracting method invocations

To use the SDT a developer must open the view ldquoSwarm Managerrdquo and es-tablish a connection with the Swarm Debugging Services If the target projectis not into the Swarm Manager she can associate any project in her work-space into Swarm Manager (as shown in Figure 18) This association consistsof linking a Swarm Session with a project in the Eclipse workspace Secondshe must create a Swarm session Once a session is established she can useany feature of the regular Eclipse debugger the SDT collects developersrsquo in-teraction events in the background with no visible performance decrease

32 httpneo4jcom

46Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 17 The Swarm Tracer architecture [17]

Fig 18 The Swarm Manager view

Typically the developer will toggle some breakpoints to stop the executionof the program of interest at locations deemed relevant to fix the fault at handThe SDT collects the data associated to these breakpoints (locations condi-tions and so on) After toggling breakpoints the developer runs the programin debug mode The program stops at the first reached breakpoint Conse-quently for each event such as Step Into or Breakpoint the SDT capturesthe event and related data It also stores data about methods called storing

Swarm Debugging the Collective Intelligence on Interactive Debugging 47

Fig 19 Breakpoint search tool (fuzzy search example)

invocations entry for each pair invokinginvoked method Following the forag-ing approach [57] the SDT only collects invokinginvoked methods that werevisited by the developer during the debugging session ignoring other invoca-tions The debugging activity continues until the program run finishes TheSwarm session is then completed

To avoid performance and memory issues the SDT collects and sends thedata using a set of specialised DomainServices that send RESTful messagesto a SwarmRestFacade connecting to the Swarm Debugging Services

Swarm Debugging Views

On top of the SDS the SDI implements and proposes several tools to searchand visualise the data collected during debugging sessions These tools areintegrated in the Eclipse IDE simplifying their usage They include but arenot limited to the followings

Sequence Stack Diagrams Sequence stack diagrams are novel diagrams [16] torepresent sequences of method invocations as shown by Figure 20 They usecircles to represent methods and arrows to represent invocations Each line isa complete stack trace without returns The first node is a starting method(non-invoked method) and the last node is an ending method (non-invokingmethod) If an invocation chain contains a non-starting method a new line iscreated and the actual stack is repeated and a dotted arrow is used to representa return for this node as illustrated by the method Circledraw in Figure 20In addition developers can directly go to a method in the Eclipse Editor bydouble-clicking over a node in the diagram

Dynamic Method Call Graphs They are direct call graphs [31] as shown inFigure 21 to display the hierarchical relations between invoked methods Theyuse circles to represent methods and oriented arrows to express invocationsEach session generates a graph and all invocations collected during the sessionare shown on these graphs The starting points (non-invoked methods) areallocated on top of a tree and adjacent nodes represent invocations sequences

48Please give a shorter version with authorrunning and titlerunning prior to maketitle

Fig 20 Sequence stack diagram for Bridge design pattern

Researchers can navigate sequences of invocation methods pressing the F9(forward) and F10 (backward) keys They can also directly go to a method inthe Eclipse Editor by double-clicking on nodes in the graphs

Breakpoint Search Tool

Researchers and developers can use this tool to find suitable breakpoints [58]when working with the debugger For each breakpoint the SDS captures thetype and location in the type where the breakpoint was toggled Thus de-velopers can share their breakpoints The breakpoint search tool allows fuzzymatch and wildcard ElasticSearch queries Results are displayed in the SearchView table for easy selection Developers can also open a type directly in theEclipse Editor by double-clicking on a selected breakpoint

Figure 19 shows an example of breakpoint search in which the search boxcontains the misspelled word fcatory

Swarm Debugging the Collective Intelligence on Interactive Debugging 49

Fig 21 Method call graph for Bridge design pattern [17]

StartingEnding Method Search Tool

This tool allows searching for methods that (1) only invoke other methods butthat are not explicitly invoked themselves during the debugging session and(2) that are only invoked by others but that do not invoke other methods

Formally we define StartingEnding methods as follows Given a graphG = (VE) where V is a set of vertexes V = V1 V2 Vn and E is a setof edges E = (V1 V2) (V 1 V 3) Then each edge is formed by a pairlt Vi Vj gt were Vi is the invoking method and Vj is the invoked methodIf α is the subset of all vertexes invoking methods and β is the subset of allvertexes invoked by methods then the Starting and Ending methods are

StartingPoint = VSP | VSP isin α and VSP isin β

EndingPoint = VEP | VEP isin β and VEP isin α

Locating these methods is important in a debugging session because theyare the entries and exits points of a program at runtime

Summary

Through the SDI we provide a technique and model to collect store and shareinteractive debugging session data contextualizing breakpoints and eventsduring these sessions We created real-time and interactive visualizations usingweb technologies providing an automatic memory for developer explorationsMoreover dividing software exploration by sessions and its call graphs areeasy to understand because only intentional visited areas are shown on these

50Please give a shorter version with authorrunning and titlerunning prior to maketitle

graphs one can go through the execution of a project and see only the impor-tant areas that are relevant to developers

Currently the Swarm Tracer is implemented in Java using Eclipse DebugCore services However SDI provides a RESTful API that can be accessedindependently and new tracers can be implemented for different IDEs or de-buggers

  • 1 Introduction
  • 2 Background
  • 3 The Swarm Debugging Approach
  • 4 SDI in a Nutshell
  • 5 Using SDI to Understand Debugging Activities
  • 6 Evaluation of Swarm Debugging using GV
  • 7 Discussion
  • 8 Threats to Validity
  • 9 Related work
  • 10 Conclusion
  • 11 Acknowledgment
Page 46: arXiv:1902.03520v1 [cs.SE] 10 Feb 2019 · 2019-02-12 · Swarm Debugging: the Collective Intelligence on Interactive Debugging 3 this information as context for their debugging. Thus,
