graph-based code completion
![Page 1: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/1.jpg)
Graph-Based Pattern-Oriented, Context-Sensitive Source Code Completion
Nguyen, T.T. ; Nguyen, H.A. ; Tamrawi, A. ; Nguyen, H.V. ; Al-Kofahi, J. ; Nguyen, T.N.
Presented By: Mohammad Masudur Rahman
![Page 2: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/2.jpg)
Contents
Code Completion
Thesis Statement
Motivating Example
Terminologies
Methodology
Empirical Evaluation & Results
My Observation & Future Thoughts
![Page 3: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/3.jpg)
Code Completion
Built-in feature of all modern IDEs
Speeds up development
Allows longer identifier names for program comprehension
Less overhead for developers
Mostly supports single variables and methods from API packages
Template-based support: control structures, event handling, and others
![Page 4: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/4.jpg)
Thesis Statement
Novel approach: graph-based code completion
Graph-based feature extraction, searching, and ranking of API usage patterns, matched against the editing context of the current code
Empirical evaluation shows correctness and usefulness: 95% precision, 92% recall, 93% F-score over 24 real-world systems
![Page 5: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/5.jpg)
Motivating Example (Single-line)
Fig 1: Current State of Code Completion (Eclipse 3.6)
![Page 6: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/6.jpg)
Motivating Example (Multi-line)
Fig 2: SWT Usage Example
![Page 7: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/7.jpg)
Motivating Example (Query)
Fig 3: SWT Query Example
![Page 8: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/8.jpg)
Terminologies
GRAPACC
API Usage Pattern
Groum-Based Model
Context-Sensitive Weight
![Page 9: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/9.jpg)
GRAPACC
Graph-Based Pattern-Oriented Context-Sensitive Code Completion
![Page 10: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/10.jpg)
API Usage Pattern
Fig 4: SWT API Usage
![Page 11: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/11.jpg)
Groum Based Model
Fig 5: Groum Conversion
![Page 12: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/12.jpg)
Context-Sensitive Weight
$$w_f(q) = \frac{1}{d + 1}$$
$w_f(q)$ = context-sensitive weight of feature $q$
$q$ = a feature of the query $Q$
$d$ = distance from the focus node to the closest token of the feature in the Groum model
![Page 13: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/13.jpg)
Methodology
Query Processing and Feature Extraction
Pattern Managing, Searching and Ranking
Pattern-Oriented Code Completion
![Page 14: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/14.jpg)
Query Processing and Feature Extraction
Tokenizing
Partial Parsing
Groum Building
Feature Extracting and Weighting
![Page 15: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/15.jpg)
Tokenizing, Partial Parsing
Lexical analysis
Keywords related to control structures are preserved; the remaining tokens are removed but saved
Eclipse Java parser
The PPA tool returns an AST (Abstract Syntax Tree)
Unresolved nodes are assigned ‘Unknown Type’
![Page 16: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/16.jpg)
Groum Building
The Groum is built from the AST
Unresolved nodes are discarded but still considered as tokens
The query is converted to the following Groum
Fig 6: Groum of Query
![Page 17: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/17.jpg)
Feature Extraction & Weighting
Groum nodes are mapped to the tokens from the tokenization step
Features are extracted from the Groum as paths of length L ≤ 3
Three factors contribute to the feature weight:
  Structure-based factor (size)
  Structure-based factor (centrality)
  User-based factor
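A minimal sketch of path-based feature extraction, assuming the Groum is held as an adjacency list and that a feature is a path of 1 to 3 nodes (the slide only states L ≤ 3). The function name `extract_path_features` and the node labels are illustrative and do not come from the paper.

```python
from typing import Dict, List, Tuple


def extract_path_features(
    groum: Dict[str, List[str]], max_size: int = 3
) -> List[Tuple[str, ...]]:
    """Enumerate every path of 1..max_size nodes in a Groum-like directed graph."""
    # Collect all nodes, including those that only appear as edge targets.
    nodes: List[str] = list(groum)
    for targets in groum.values():
        for target in targets:
            if target not in nodes:
                nodes.append(target)

    features: List[Tuple[str, ...]] = []

    def extend(path: List[str]) -> None:
        # Every prefix of a path is itself a feature.
        features.append(tuple(path))
        if len(path) == max_size:
            return
        for nxt in groum.get(path[-1], []):
            if nxt not in path:  # guard against cycles
                extend(path + [nxt])

    for node in nodes:
        extend([node])
    return features


# Hypothetical three-node Groum: Display.new -> Shell.new -> Shell.open
groum = {"Display.new": ["Shell.new"], "Shell.new": ["Shell.open"]}
features = extract_path_features(groum)
```

For this chain the sketch yields three single-node features, two two-node paths, and one three-node path.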
![Page 18: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/18.jpg)
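Feature Extraction & Weighting
$w_s(q)$ = size-based weight of feature $q$ of query $Q$: $w_s(q) = 1 + \mathrm{size}(q)$, with $1 \le \mathrm{size}(q) \le 3$
$w_c(q)$ = centrality-based weight of feature $q$ of query $Q$: $w_c(q) = n / s$, where $n$ = number of neighbors and $s$ = size
$w_f(q)$ = user-based weight of feature $q$ of query $Q$: $w_f(q) = 1 / (d + 1)$, where $d$ = distance between the focus node and the closest token of the feature path in the Groum model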
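The three factors can be sketched directly from the formulas on this slide. The function names are assumptions made for illustration. How the factors combine into the total weight is not recoverable from the transcript, so the sketch keeps them separate.

```python
def size_weight(size: int) -> int:
    """w_s(q) = 1 + size(q), defined for 1 <= size(q) <= 3."""
    if not 1 <= size <= 3:
        raise ValueError("feature size must be between 1 and 3")
    return 1 + size


def centrality_weight(neighbors: int, size: int) -> float:
    """w_c(q) = n / s, with n neighbors and feature size s."""
    return neighbors / size


def context_weight(distance: int) -> float:
    """w_f(q) = 1 / (d + 1), with d the distance to the focus node."""
    return 1 / (distance + 1)


# Example: a 2-node feature with 4 neighbors, one step away from the focus node.
ws = size_weight(2)            # 3
wc = centrality_weight(4, 2)   # 2.0
wf = context_weight(1)         # 0.5
```

A feature that contains the focus node itself (d = 0) gets the maximum context weight of 1, and the weight decays as the feature moves away from the editing position.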
![Page 19: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/19.jpg)
Feature Extraction & Weighting
$w(q)$ = total weight of feature $q$ of query $Q$
$w_s(q)$ = size-based weight of feature $q$ of query $Q$
$w_c(q)$ = centrality-based weight of feature $q$ of query $Q$
$w_f(q)$ = user-based weight of feature $q$ of query $Q$
![Page 20: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/20.jpg)
Pattern Managing, Searching and Ranking
Pr(P) = popularity of pattern P, i.e. the frequency of P
The weight of feature p in pattern P is computed using inverse indexing
$N_{p,P}$ = number of occurrences of feature p in P; $N_P$ = total number of features in P
$N_p$ = number of patterns containing p; $N$ = total number of patterns in the database
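The weight formula itself did not survive in this transcript. The four counts match the inputs of a standard TF-IDF weight, so the sketch below assumes that form (term frequency times the natural log of inverse pattern frequency). The exact formula used in the paper may differ, and the function name `feature_weight` is hypothetical.

```python
import math


def feature_weight(n_p_in_P: int, n_P: int, n_p: int, N: int) -> float:
    """TF-IDF style weight of feature p in pattern P (assumed form).

    n_p_in_P: occurrences of p in P
    n_P:      total number of features in P
    n_p:      number of patterns containing p
    N:        total number of patterns in the database
    """
    tf = n_p_in_P / n_P       # how prominent p is inside P
    idf = math.log(N / n_p)   # how rare p is across all patterns
    return tf * idf


# A feature occurring twice among 10 features, found in 5 of 100 patterns.
w = feature_weight(2, 10, 5, 100)
```

Under this assumed form, a feature that appears in every pattern gets weight 0, so it does not help to tell patterns apart.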
![Page 21: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/21.jpg)
Pattern Managing, Searching and Ranking
For each feature p, L(p) is the list of patterns from which p can be extracted
p denotes a pattern feature, q a query feature
If sim(p, q) > δ (a similarity threshold), then p is added to F, the set of mapped features for q
For each p ∈ F, the top n ranked patterns from L(p) are added to C, the set of candidate patterns for relevance computation
For each P in C, compute fit(P, Q)
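The search step above can be sketched as follows, assuming the inverted index L(p) is a dictionary whose pattern lists are already ranked, and that `sim` is supplied by the caller. The names `candidate_patterns`, `index`, and the pattern identifiers are illustrative only.

```python
from typing import Callable, Dict, List


def candidate_patterns(
    query_features: List[str],
    index: Dict[str, List[str]],        # L(p): feature -> ranked pattern ids
    sim: Callable[[str, str], float],   # feature similarity
    threshold: float,                   # similarity threshold
    top_n: int,
) -> List[str]:
    """Collect the candidate set C for a query."""
    candidates: List[str] = []
    for q in query_features:
        # F: pattern features similar enough to the query feature q.
        mapped = [p for p in index if sim(p, q) > threshold]
        for p in mapped:
            # Take the top-n ranked patterns from L(p).
            for pattern in index[p][:top_n]:
                if pattern not in candidates:
                    candidates.append(pattern)
    return candidates


# Hypothetical index and an exact-match similarity, for illustration.
index = {"Shell.new": ["P1", "P2", "P3"], "Button.new": ["P2", "P4"]}
exact = lambda p, q: 1.0 if p == q else 0.0
C = candidate_patterns(["Shell.new", "Button.new"], index, exact, 0.5, 2)
```

Each pattern in C would then be scored with fit(P, Q). Limiting the search to the top n patterns per mapped feature keeps that more expensive step small.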
![Page 22: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/22.jpg)
Feature Similarity
nsim is a name-based similarity between two features, given that a feature is a collection of labels and has the form X.Y.Z, where X = package name, Y = class name, and Z = method name
![Page 23: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/23.jpg)
Name-based Similarity (nsim)
wsim(X, X') is a word-based similarity
X and X' are broken down into two sequences of words, L(X) and L(X')
The similarity is computed as $L_o / L_m$
$L_o$ is the length of the LCS (longest common subsequence) of the two sequences; $L_m$ is their average length
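A sketch of the word-based similarity, assuming names are split into words at camelCase boundaries (the slide does not say how names are broken down). The helper names `split_words`, `lcs_length`, and `wsim` are chosen for illustration.

```python
import re
from typing import List


def split_words(name: str) -> List[str]:
    """Split an identifier into lowercase words at camelCase boundaries."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [part.lower() for part in parts]


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two word sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def wsim(x: str, y: str) -> float:
    """Word-based similarity Lo / Lm between two names."""
    wx, wy = split_words(x), split_words(y)
    if not wx or not wy:
        return 0.0
    lo = lcs_length(wx, wy)        # Lo: length of the LCS
    lm = (len(wx) + len(wy)) / 2   # Lm: average sequence length
    return lo / lm


score = wsim("getFileName", "getFileSize")
```

Here both names split into three words and share the subsequence `get`, `file`, so Lo = 2, Lm = 3, and the similarity is 2/3.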
![Page 24: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/24.jpg)
Pattern Matching (Relevance)
![Page 25: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/25.jpg)
Pattern Matching
SM(P, Q) = total weight of the matched feature pairs
fit(P, Q) = relevance degree between P and Q
Pr(P) = popularity of pattern P
![Page 26: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/26.jpg)
Pattern Oriented Code Completion
The best-matched pattern is selected, and its nodes are matched to the corresponding nodes in the query's Groum
The pattern nodes missing from the query are filled in with code
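A deliberately simplified sketch of the completion step. It treats the selected pattern as an ordered list of node labels rather than a full graph and reports which labels the query lacks. Real Groums also carry data and control dependencies, which this ignores. The function name and labels are hypothetical.

```python
from typing import List


def missing_nodes(pattern_nodes: List[str], query_nodes: List[str]) -> List[str]:
    """Return pattern nodes absent from the query, in pattern order."""
    present = set(query_nodes)
    return [node for node in pattern_nodes if node not in present]


# Hypothetical SWT-style pattern and a partially written query.
pattern = ["Display.new", "Shell.new", "Shell.setLayout", "Shell.open"]
query = ["Display.new", "Shell.new"]
to_fill = missing_nodes(pattern, query)
```

The labels in `to_fill` correspond to the code that would be generated and inserted at the editing position.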
![Page 27: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/27.jpg)
Empirical Evaluation
Metrics: precision, recall, F-score
APIs used as libraries: java.io, java.util
28 real-world open-source systems: 4 for training, 24 for testing
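For reference, a sketch of the three metrics using their usual set-based definitions. How the paper counts a correct recommendation is not stated on the slide, so the set representation here is an assumption.

```python
from typing import Set, Tuple


def precision_recall(recommended: Set[str], expected: Set[str]) -> Tuple[float, float]:
    """Precision = correct / recommended, recall = correct / expected."""
    correct = len(recommended & expected)
    precision = correct / len(recommended) if recommended else 0.0
    recall = correct / len(expected) if expected else 0.0
    return precision, recall


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical recommendation with 3 of 4 items correct, out of 5 expected.
p, r = precision_recall({"a", "b", "c", "d"}, {"a", "b", "c", "e", "f"})
f = f_score(p, r)

# The reported 95% precision and 92% recall are consistent with a 93% F-score.
reported_f = round(f_score(0.95, 0.92), 2)
```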
![Page 28: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/28.jpg)
Empirical Evaluation
![Page 29: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/29.jpg)
My Observation
Planning to use semantic web technology
Data and control dependency relationships can be improved using semantic relationships such as conceptual similarity
Pattern matching is complex and error-prone; a semantic score can be beneficial
![Page 30: Graph-Based Code Completion](https://reader036.vdocument.in/reader036/viewer/2022062704/55619fa0d8b42ad9538b4940/html5/thumbnails/30.jpg)
Thanks
Questions??