towards aiding within-patch information foraging by end-user programmers balaji athreya, chris...

15
Towards aiding within-patch information foraging by end-user programmers Balaji Athreya, Chris Scaffidi Oregon State University

Upload: damon-norman

Post on 02-Jan-2016

217 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Towards aiding within-patch information foraging by end-user programmers Balaji Athreya, Chris Scaffidi Oregon State University

Towards aiding within-patch information

foraging by end-user programmers

Balaji Athreya, Chris ScaffidiOregon State University

Page 2: Towards aiding within-patch information foraging by end-user programmers Balaji Athreya, Chris Scaffidi Oregon State University

Finding information in code is difficult

• Information Foraging Theory (IFT) describes two main costs that people incur when finding information in an information environment

• Navigation between “patches,” which for programmers can be thought of as methods connected by navigable links• Processing information “features” within patches, such as

the words in the source code of methods

Page 3: Towards aiding within-patch information foraging by end-user programmers Balaji Athreya, Chris Scaffidi Oregon State University

Strategies to help programmers find information in code• Help programmers navigate between patches

• By automatically generating useful links (e.g., Gilligan), by letting user ask questions (e.g., WhyLine), or by supporting new kinds of search (e.g., CodeFinder)

• Strategy has been extensively tested with professional and end-user programmers

• Help programmers understand patches• By highlighting code similarities (e.g., Jigsaw), or by summarizing important

statements and eliding unimportant statements (e.g., Sridhara’s tool) • Strategy has been extensively tested with professional programmers• A similar strategy, narrowly, has been used to help EUPs to find bugs

Page 4: Towards aiding within-patch information foraging by end-user programmers Balaji Athreya, Chris Scaffidi Oregon State University

Tools based on the latter strategy apply heuristics to identify important statements• For example, Sridhara’s algorithm: Given a particular

method…• Classify statements based on whether they…

• Only invoke void-return methods• Invoke methods with names similar to current method• Return from the current method• Facilitate data flow in particular ways among above statements• Perform control flow (e.g., if, loop) in particular ways of statements above

• Identify statements as “important”• If they meet the criteria above

• And they are not in exception handler or variable initializer• And (for control flow statements) they do not have an empty “else” block

Page 5: Towards aiding within-patch information foraging by end-user programmers Balaji Athreya, Chris Scaffidi Oregon State University

But does increasing relative visual weight of important information features help EUPs?• Perhaps no• Write-only code: EUPs’ code is often written to be used once• Nardi: “It is not clear whether users who modify existing example

programs could ever really come to understand the programs they modify”

• Perhaps yes• EUPs often have less training than professional programmers• Maybe people who have low levels of training would

benefit even more than professional programmers

Page 6: Towards aiding within-patch information foraging by end-user programmers Balaji Athreya, Chris Scaffidi Oregon State University

Two prototypes: Each manipulates relative visual weight in a different way• Both prototypes identify “important” statements• Sridhara’s algorithm, applied to TouchDevelop, with no

interesting changes

• Prototype #1: Highlight important statements• Different colors depending on statement type

• Prototype #2: Hide unimportant statements• Display just line number, for toggling hide/unhide

Page 7: Towards aiding within-patch information foraging by end-user programmers Balaji Athreya, Chris Scaffidi Oregon State University

Study to evaluate impact of these prototypes• Between-subject design, with three treatments• #1: Using prototype 1 (highlighting important)• #2: Using prototype 2 (hiding unimportant)• #3: Using baseline (standard TouchDevelop tool)

• Key comparisons are #1 vs #3 and #2 vs #3• To understand the impact of each method of manipulating

relative visual weight, independently, versus the baseline• The goal is not to compete the prototypes against one another

Page 8: Towards aiding within-patch information foraging by end-user programmers Balaji Athreya, Chris Scaffidi Oregon State University

Task: Find information in 3 different programs• Programs varied in size

• To help clarify if any impacts were due to between-patch navigation

• Ecological validity• Size: a bit on long side (earlier study: 75% programs <= 100 LOC)• Purpose: quite typical (33% were utilities, 11% games, 7% images)

Program Lines of code Number of methods Program purpose

Small 38 1 Guessing game

Medium 122 5 Setting a timer

Large 326 24 Editing an image

Page 9: Towards aiding within-patch information foraging by end-user programmers Balaji Athreya, Chris Scaffidi Oregon State University

Analyzing 3 measures of impact

• Created program comprehension questions based on our earlier study of EUPs’ historical TouchDevelop maintenance tasks• Approximately 1 question for every 50 lines of code, 6 total

• Measures of impact (per program)• Questions correct, time taken (sec), efficiency (ratio of these scaled times

1000)

• Kruskal-Wallis nonparametric statistical test with a two-tailed ANOVA of each measure rank versus two factors: treatment and program• Robust to outliers, small samples, and non-normal distributions• So we can see impact of the treatment (tool) while controlling for program

size

Page 10: Towards aiding within-patch information foraging by end-user programmers Balaji Athreya, Chris Scaffidi Oregon State University

Data from 30 pre-CS students

• Very early in their training (taking freshman or sophomore CS courses)

• Inclusion criteria• Understand English, adults, and never professional programmers

• Had to drop data from 1 of the 31 participants (who didn’t know English)

Page 11: Towards aiding within-patch information foraging by end-user programmers Balaji Athreya, Chris Scaffidi Oregon State University

Results: Highlighting helped, but hiding harmed;all effects were consistent across program size

Measure Highlight

P value Direction & F score

Score 0.01 + 7.35

Time 0.70 — 0.12

Efficiency 0.03 + 5.06

Measure Hide

P value Direction & F score

Score 0.08 — 3.15

Time 0.28 + 1.14

Efficiency 0.04 — 4.33

Page 12: Towards aiding within-patch information foraging by end-user programmers Balaji Athreya, Chris Scaffidi Oregon State University

Further study could address threats to validity• What if the algorithm doesn’t find truly important statements?

• In the case of this experiment, we manually verified that it did find all the statements needed for answering the questions.

• But, supposing some situation where it missed truly important statements…• Highlighting might not help as much… further study needed• Hiding probably would hurt even more (due to hiding truly important

statements)

• Would these results generalize to EUPs “in the wild”?• In the case of our experiment, participants were novices and EUPs

• No study has really looked at novices vs EUPs… further study needed• Tasks were based on typical TouchDevelop tasks

• But didn’t study effects on tasks that use large data structures… further study needed

Page 13: Towards aiding within-patch information foraging by end-user programmers Balaji Athreya, Chris Scaffidi Oregon State University

Conclusions and implications for theory

• Highlighting important helped, but hiding unimportant information hurt

• The “unimportant” statements apparently contributed to obtaining value from the important statements• Value of important statements was contingent on unimportant statements• IFT currently lacks any notion of contingent information feature value

• The optimal foraging path might not be the one that goes straight to the needed information.• Sometimes, optimal != shortest.

Page 14: Towards aiding within-patch information foraging by end-user programmers Balaji Athreya, Chris Scaffidi Oregon State University

Conclusions and implications for tool-building• For coding: Highlighting important statements might help EUPs to

understand their own code better and avoid creating bugs

• For version control: Highlighting important statements in “diffs” might help EUPs to better understand key differences between versions

• For debugging: Highlighting upcoming or past important statements during step-wise execution might aid understanding and finding bugs

• For reuse: Highlighting important statements in existing code might help EUPs understand its purpose and aid in deciding whether to reuse

Page 15: Towards aiding within-patch information foraging by end-user programmers Balaji Athreya, Chris Scaffidi Oregon State University

Thank you…

• To you for your attention, interest, and ideas• To the VL/HCC reviewers for your compliments and

suggestions• To Microsoft for lending TouchDevelop-equipped phones• To the National Science Foundation for funding