supporting newcomers in open source software development projects

Post on 04-Dec-2014

220 Views

Category:

Presentations & Public Speaking

0 Downloads

Preview:

Click to see full reader

DESCRIPTION

PhD Dissertation by Sebastiano Panichella Title Thesis: "Supporting Newcomers in Software Development Projects" Advisors: Massimiliano Di Penta and Gerardo Canfora

TRANSCRIPT

1

Supporting Newcomers in SoftwareDevelopment Projects

Doctoral Dissertationby

Sebastiano Panichella

Under the supervision of

Prof Massimiliano Di PentaProf Gerardo Canfora

July 2014

2

Newcomer Learning Pathhellip

3

Newcomer Learning Pathhellip

4

Newcomer Learning Pathhellip

5

Newcomer Learning Pathhellip

6

Data Extraction

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

7

Data Extraction

Empirical Studies

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

8

Data Extraction

Recommenders

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Empirical Studies

Newcomer Learning Pathhellip

9

Data Extraction

Recommenders

Studies

Newcomer Training Process

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries3) Analyze developers c) social activity d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

2

Newcomer Learning Pathhellip

3

Newcomer Learning Pathhellip

4

Newcomer Learning Pathhellip

5

Newcomer Learning Pathhellip

6

Data Extraction

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

7

Data Extraction

Empirical Studies

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

8

Data Extraction

Recommenders

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Empirical Studies

Newcomer Learning Pathhellip

9

Data Extraction

Recommenders

Studies

Newcomer Training Process

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries3) Analyze developers c) social activity d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

3

Newcomer Learning Pathhellip

4

Newcomer Learning Pathhellip

5

Newcomer Learning Pathhellip

6

Data Extraction

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

7

Data Extraction

Empirical Studies

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

8

Data Extraction

Recommenders

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Empirical Studies

Newcomer Learning Pathhellip

9

Data Extraction

Recommenders

Studies

Newcomer Training Process

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries3) Analyze developers c) social activity d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

4

Newcomer Learning Pathhellip

5

Newcomer Learning Pathhellip

6

Data Extraction

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

7

Data Extraction

Empirical Studies

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

8

Data Extraction

Recommenders

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Empirical Studies

Newcomer Learning Pathhellip

9

Data Extraction

Recommenders

Studies

Newcomer Training Process

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries3) Analyze developers c) social activity d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

5

Newcomer Learning Pathhellip

6

Data Extraction

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

7

Data Extraction

Empirical Studies

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

8

Data Extraction

Recommenders

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Empirical Studies

Newcomer Learning Pathhellip

9

Data Extraction

Recommenders

Studies

Newcomer Training Process

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries3) Analyze developers c) social activity d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

6

Data Extraction

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

7

Data Extraction

Empirical Studies

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

8

Data Extraction

Recommenders

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Empirical Studies

Newcomer Learning Pathhellip

9

Data Extraction

Recommenders

Studies

Newcomer Training Process

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries3) Analyze developers c) social activity d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

7

Data Extraction

Empirical Studies

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

8

Data Extraction

Recommenders

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Empirical Studies

Newcomer Learning Pathhellip

9

Data Extraction

Recommenders

Studies

Newcomer Training Process

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries3) Analyze developers c) social activity d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

8

Data Extraction

Recommenders

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Empirical Studies

Newcomer Learning Pathhellip

9

Data Extraction

Recommenders

Studies

Newcomer Training Process

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries3) Analyze developers c) social activity d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

9

Data Extraction

Recommenders

Studies

Newcomer Training Process

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries3) Analyze developers c) social activity d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries3) Analyze developers c) social activity d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

11

Data Extraction

Recommenders1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries3) Analyze developers c) social activity d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries3) Analyze developers c) social activity d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

top related