Mining StackOverflow to Turn the IDE into a Self-confident Programming Prompter
Luca Ponzanelli, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, and Michele Lanza
http://prompter.inf.usi.ch
The Lone Developer
Collaborating People
Pair Programming
Online Resources
Recommender Systems for Software Engineering
M. P. Robillard, R. J. Walker, and T. Zimmermann. Recommender systems for software engineering. IEEE Software, 2010.
“RSSEs are software applications that provide information estimated to be valuable for a software engineering task in a given context”
No Spontaneous Recommendation
No Self-Confidence
Pair Programming
The Prompter
Programming Prompter
[Figure: box plot for the Development Task. x-axis: Treatment (NP = Without Prompter, P = With Prompter); y-axis: Completeness]
Prompter is effective in development tasks
[Diagram: Prompter (Eclipse), Code Context, Query Generation Service]
API Types: org.tartarus.snowball.SnowballStemmer, org.tartarus.snowball.ext.englishStemmer, java.util.List, java.util.ArrayList, String
API Method Names: setCurrent, getCurrent, stem, add
Entity Code:
    @Override
    public List<String> filter(final List<String> tokens) {
        final List<String> stemmed = new ArrayList<String>();
        for (final String t : tokens) {
            SnowballStemmer stemmer = new englishStemmer();
            stemmer.setCurrent(t);
            stemmer.stem();
            stemmed.add(stemmer.getCurrent());
        }
        return stemmed;
    }
[Diagram: Query Generation Service, Query, Prompter (Eclipse)]
Term       Frequency   Entropy   Frequency * (1 - Entropy)
stemmer    6           0.15      5.1
stemmed    3           0.15      2.55
tokens     2           0.45      1.1
list       4           0.74      1.04
snowball   1           0.11      0.89
stem       1           0.25      0.75
english    1           0.51      0.49
filter     1           0.58      0.42
array      1           0.72      0.28
set        1           0.8       0.2
add        1           0.84      0.16
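The last column of the table can be reproduced with a few lines of Java. This is a minimal sketch, not Prompter's actual implementation: the class `QueryTermRanker`, its methods, and the cut-off `k` are hypothetical names introduced for illustration, and the entropy values are simply copied from the table rather than computed, since the slides do not show how they are obtained.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper (not from the slides): ranks terms by frequency * (1 - entropy). */
final class QueryTermRanker {

    /** A candidate query term with its frequency in the entity code and its entropy. */
    static final class Term {
        final String text;
        final int frequency;
        final double entropy;

        Term(String text, int frequency, double entropy) {
            this.text = text;
            this.frequency = frequency;
            this.entropy = entropy;
        }

        /** Score as in the last column of the table. */
        double score() {
            return frequency * (1.0 - entropy);
        }
    }

    /** The rows of the table, with entropy taken as given. */
    static List<Term> slideTerms() {
        List<Term> terms = new ArrayList<>();
        terms.add(new Term("stemmer", 6, 0.15));
        terms.add(new Term("stemmed", 3, 0.15));
        terms.add(new Term("tokens", 2, 0.45));
        terms.add(new Term("list", 4, 0.74));
        terms.add(new Term("snowball", 1, 0.11));
        terms.add(new Term("stem", 1, 0.25));
        terms.add(new Term("english", 1, 0.51));
        terms.add(new Term("filter", 1, 0.58));
        terms.add(new Term("array", 1, 0.72));
        terms.add(new Term("set", 1, 0.8));
        terms.add(new Term("add", 1, 0.84));
        return terms;
    }

    /** Returns the k highest-scoring terms, best first (k is an assumed cut-off). */
    static List<String> topTerms(List<Term> terms, int k) {
        List<Term> sorted = new ArrayList<>(terms);
        Comparator<Term> byScore = Comparator.comparingDouble(Term::score);
        sorted.sort(byScore.reversed());
        List<String> result = new ArrayList<>();
        for (Term t : sorted.subList(0, Math.min(k, sorted.size()))) {
            result.add(t.text);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(topTerms(slideTerms(), 4)); // [stemmer, stemmed, tokens, list]
    }
}
```

Frequent terms with low entropy, such as "stemmer", rise to the top, while common terms such as "add" or "set" sink even though they appear in the code.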
[Architecture diagram, built up over several slides]
Components: Prompter (Eclipse), Query Generation Service, Search Service, Search Engines Proxy, Search Engines (Bing, Blekko), Stack Overflow API Service, Ranking Model
Data exchanged: Code Context, Query, Results, Discussion IDs, Documents, Ranked Results
<code> … </code> ~ <code> … </code>
Code Related
Textual Similarity
Code Similarity
API Types Similarity
API Method Names Similarity
Community Related
Question Score
Accepted Answer Score
User Reputation
Tags Similarity
Code Context Similarity Features
Model Calibration
74 Code Contexts
Retrieve Discussions (37 for Calibration)
Manual Classification
$S = \sum_{i=1}^{n} w_i \cdot f_i$, having $\sum_{i=1}^{n} w_i = 1$
Find all $w_i$ that maximize the number of relevant discussions ranked at the top
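One way to picture this calibration step is an exhaustive search over weight vectors on a coarse grid. The sketch below is only an illustration under stated assumptions: the slides do not say which search procedure was used, and the class `WeightCalibration`, the `Discussion` type, the grid resolution `steps`, and the toy data in `main` are all hypothetical.

```java
import java.util.List;

/** Hypothetical sketch: grid search for weights w_i >= 0 that sum to 1. */
final class WeightCalibration {

    /** One retrieved discussion: feature values f_i and its manual relevance label. */
    static final class Discussion {
        final double[] features;
        final boolean relevant;

        Discussion(double[] features, boolean relevant) {
            this.features = features;
            this.relevant = relevant;
        }
    }

    /** S = sum_i w_i * f_i. */
    static double score(double[] w, double[] f) {
        double s = 0.0;
        for (int i = 0; i < w.length; i++) {
            s += w[i] * f[i];
        }
        return s;
    }

    /** Counts the code contexts whose top-scored discussion is labeled relevant. */
    static int relevantAtTop(double[] w, List<List<Discussion>> contexts) {
        int hits = 0;
        for (List<Discussion> discussions : contexts) {
            Discussion best = null;
            for (Discussion d : discussions) {
                if (best == null || score(w, d.features) > score(w, best.features)) {
                    best = d;
                }
            }
            if (best != null && best.relevant) {
                hits++;
            }
        }
        return hits;
    }

    /** Tries every weight vector whose entries are multiples of 1/steps and sum to 1. */
    static double[] calibrate(List<List<Discussion>> contexts, int n, int steps) {
        double[] best = new double[n];
        int[] bestHits = {-1};
        search(new int[n], 0, steps, steps, contexts, best, bestHits);
        return best;
    }

    private static void search(int[] units, int index, int remaining, int steps,
                               List<List<Discussion>> contexts, double[] best, int[] bestHits) {
        if (index == units.length - 1) {
            units[index] = remaining; // the last weight takes what is left, so the sum is 1
            double[] w = new double[units.length];
            for (int i = 0; i < w.length; i++) {
                w[i] = units[i] / (double) steps;
            }
            int hits = relevantAtTop(w, contexts);
            if (hits > bestHits[0]) {
                bestHits[0] = hits;
                System.arraycopy(w, 0, best, 0, w.length);
            }
            return;
        }
        for (int u = 0; u <= remaining; u++) {
            units[index] = u;
            search(units, index + 1, remaining - u, steps, contexts, best, bestHits);
        }
    }

    public static void main(String[] args) {
        // Toy data: two code contexts, two features per discussion.
        List<List<Discussion>> contexts = List.of(
            List.of(new Discussion(new double[] {0.9, 0.1}, true),
                    new Discussion(new double[] {0.1, 0.9}, false)),
            List.of(new Discussion(new double[] {0.8, 0.2}, true),
                    new Discussion(new double[] {0.3, 1.0}, false)));
        double[] w = calibrate(contexts, 2, 10);
        System.out.println("relevant at top: " + relevantAtTop(w, contexts));
    }
}
```

Because the search is exhaustive, its cost grows quickly with the number of features and the grid resolution, so a real calibration might prefer a coarser grid or a different optimizer.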
Model Calibration
$S = \sum_{i=1}^{n} w_i \cdot f_i$, having $\sum_{i=1}^{n} w_i = 1$

Code Related (f_i)             w_i
Textual Similarity             0.32
Code Similarity                0.00
API Types Similarity           0.00
API Method Names Similarity    0.30

Community Related (f_i)        w_i
Question Score                 0.07
Accepted Answer Score          0.00
User Reputation                0.13
Tags Similarity                0.18
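With the calibrated weights, the ranking score of a discussion is a plain weighted sum. The following sketch shows the arithmetic; the class `DiscussionScorer` is a hypothetical name, the feature values in `main` are made up, and the sketch assumes each feature is already normalized to the range [0, 1], which the slides do not state.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: applies the calibrated weights from the table above. */
final class DiscussionScorer {

    /** Weights w_i as listed in the calibration table; they sum to 1. */
    static final Map<String, Double> WEIGHTS = new LinkedHashMap<>();

    static {
        WEIGHTS.put("Textual Similarity", 0.32);
        WEIGHTS.put("Code Similarity", 0.00);
        WEIGHTS.put("API Types Similarity", 0.00);
        WEIGHTS.put("API Method Names Similarity", 0.30);
        WEIGHTS.put("Question Score", 0.07);
        WEIGHTS.put("Accepted Answer Score", 0.00);
        WEIGHTS.put("User Reputation", 0.13);
        WEIGHTS.put("Tags Similarity", 0.18);
    }

    /** Computes S = sum_i w_i * f_i; features absent from the map count as 0 (an assumption). */
    static double score(Map<String, Double> features) {
        double s = 0.0;
        for (Map.Entry<String, Double> e : WEIGHTS.entrySet()) {
            s += e.getValue() * features.getOrDefault(e.getKey(), 0.0);
        }
        return s;
    }

    public static void main(String[] args) {
        // Made-up feature values for one discussion, assumed to lie in [0, 1].
        Map<String, Double> f = new LinkedHashMap<>();
        f.put("Textual Similarity", 0.5);
        f.put("API Method Names Similarity", 1.0);
        f.put("Tags Similarity", 0.5);
        // 0.32 * 0.5 + 0.30 * 1.0 + 0.18 * 0.5 = 0.55
        System.out.println(String.format(Locale.ROOT, "S = %.2f", score(f)));
    }
}
```

Features with a calibrated weight of 0.00 (Code Similarity, API Types Similarity, Accepted Answer Score) do not influence the score at all.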
Validation
Study I: Evaluating Recommendations Accuracy
Study II: Evaluating Prompter with Developers
Study I: Evaluating Recommendations Accuracy
33 Participants (Online Survey)
Industry: 13
Ph.D.: 9
Master: 7
Bachelor: 2
Faculty: 2
Study I Summary
“76% of the discussions were considered related (median 4) or strongly related (median 5) by developers, while only 10% was considered as unrelated.”
Study II: Evaluating Prompter with Developers
12 Participants
Industry: 6
Master: 3
Bachelor: 3
[Study design diagram: Development and Maintenance tasks, each under two treatments: Prompter and Without Prompter]
Study II Quantitative Analysis
[Figure: two box plots, one for the Maintenance Task and one for the Development Task. x-axis: Treatment (NP = Without Prompter, P = With Prompter); y-axis: Completeness]
Study II Qualitative Analysis
11 out of 12 Participants would use Prompter in their daily activities
Explicitly write and execute queries
Turn the IDE into a Self-Confident Programming Prompter