Root Cause Analysis for HTML Presentation Failures using Search-Based Techniques

Sonal Mahajan, Bailan Li, William G.J. Halfond

Department of Computer Science, University of Southern California


Post on 11-Jan-2016


What is a presentation failure?

• Web page rendering ≠ expected appearance

[Figure: expected appearance (oracle) beside the web page rendering]


Difference 1: Alignment problem


Difference 2: Color problem


Difference 3: Style problem


Presentation Failures

• Common in modern web applications
  – Highly complex
  – Dynamic nature of HTML, CSS, JavaScript
• Difficult to diagnose and debug
  – Each page has hundreds of HTML elements
  – Each HTML element contains several styling properties


Why is handling presentation failures important?

• Presentation of a website
  – Factors into company branding
  – Gives the first impression of your business
• Presentation failures can
  – Impact usability
  – Create a negative perception of quality


When do presentation failures occur?

1. Front-end developer did not deliver a pixel-perfect implementation [1]

2. Refactoring of UI

3. Web application was not tested sufficiently


Need to Debug Presentation Failures

• Throughout the development process

• Three such scenarios:
  1. Presentation Development Testing
  2. Regression Debugging
  3. Standard Debugging


1. Presentation Development Testing

• Front-end developers
  – Expected to convert mockups to “pixel perfect” template pages


“Pixel-perfect” pages… Is it reasonable?


1. Presentation Development Testing

• Front-end developers
  – Expected to convert mockups to “pixel perfect” template pages
• Back-end developers
  – Change templates by adding dynamic content
• Test to check whether the implemented page complies with the given mockup
• Expected appearance (oracle) -> mockup


2. Regression Debugging

• Changes to code after initial implementation
  – E.g.: Refactoring page from <table> based layout to <div> based layout
• Changes not intended to change appearance
• Change may have direct or indirect impact
• Test for presentation failures and debug to find responsible HTML elements
• Expected appearance (oracle) -> previous correct version of the page


3. Standard Debugging

• Make corrective code changes based on bug reports
  – E.g.: Resolve user-reported failures
• Reproduce the failure and debug
• Expected appearance (oracle) -> marked screenshot with failure area


What is the root cause of a presentation failure?

Root cause
  – Faulty HTML element
  – Faulty visual property
    • CSS property
    • HTML attribute


Limitations of Related Approaches

• Manual interaction
  – Browser developer tools (e.g.: Firebug)
  – Labor-intensive and error-prone
• Selenium, Sikuli
  – Require correctness invariants to be specified exhaustively
• Cross-browser testing
  – Cannot report the exact root cause (the faulty visual property)
• Fighting layout bugs
  – Cannot report a root cause; application independent
• DOM differencing
  – Techniques such as XBT, GUI differencing, automated oracles
  – Assume a “golden” version of the page
  – Cannot be used if there is no golden version or the DOM has changed


Simple Approach

• Brute force exploration of possible root cause space
  1. Substitute different values for each root cause
  2. Compare web page and oracle
  3. If same appearance, stop, else continue
• Limitation
  – Large universe of possible values
    • E.g.: Margin property: [-∞, +∞]
    • Color property: 16 million colors
  – Very expensive


New Idea

• Key Insights
  1. Image processing defines successful search
     • Compare web page and oracle
     • Correct root cause identified
  2. Image processing guides search
     • Fitness functions (e.g.: minimizing difference pixels)

Use image processing to define root cause analysis as a search-based technique


Mapping Root Cause Analysis to Search-based Problem

• Motivations
  – Large search space of root causes
  – Image processing to define search parameters
  – Availability of oracle image -> natural form of invariant specifications
• Use genetic algorithm


Genetic algorithm

• Population: Possible values for a visual property
• Initial population: Generated randomly
• Selection: Linear ranking
• Crossover: One point
• Mutation: Uniform mutation
• Fitness function: Minimize visual differences
• Stopping criteria: web page = oracle
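
The operators named on this slide can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the 8-bit integer encoding, the population size, the mutation rate, the elitism step, and the `difference` callback (a stand-in for rendering the page with a candidate value and measuring the visual difference against the oracle) are all assumptions made for the example.

```python
import random

BITS = 8  # assumption: candidate values are integers in 0..255


def linear_rank_select(ranked, rng):
    """Pick one individual from a best-first list; weight falls linearly with rank."""
    n = len(ranked)
    weights = [n - i for i in range(n)]
    return rng.choices(ranked, weights=weights, k=1)[0]


def one_point_crossover(a, b, rng):
    """Keep the high bits of a and the low bits of b, split at one random point."""
    point = rng.randint(1, BITS - 1)
    low_mask = (1 << point) - 1
    return (a & ~low_mask) | (b & low_mask)


def uniform_mutation(value, rate, rng):
    """With probability rate, replace the value by a uniformly random one."""
    if rng.random() < rate:
        return rng.randrange(1 << BITS)
    return value


def search(difference, pop_size=20, generations=300, rate=0.2, seed=0):
    """Return a value whose difference is 0, or None if the budget runs out."""
    rng = random.Random(seed)
    population = [rng.randrange(1 << BITS) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=difference)  # lower difference is fitter
        if difference(ranked[0]) == 0:               # stopping criterion: page = oracle
            return ranked[0]
        children = [ranked[0]]                       # elitism (an assumption)
        while len(children) < pop_size:
            a = linear_rank_select(ranked, rng)
            b = linear_rank_select(ranked, rng)
            child = one_point_crossover(a, b, rng)
            children.append(uniform_mutation(child, rate, rng))
        population = children
    best = min(population, key=difference)
    return best if difference(best) == 0 else None
```

A call such as `search(lambda v: abs(v - 5))` uses a toy difference function in place of real image comparison; because the search is randomized and budgeted, it may also return `None`.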


Core Idea

• Try different values for a candidate root cause
• Fitness value = compare web page and oracle
• If max. fitness value (web page = oracle)
  – Stop
• Else
  – Continue search


Example

• Candidate root cause: <div, padding>
• Population: [-∞, +∞]
• Initial population: {20, 50, 100, …, 0, 5}

[Figure: oracle beside the test web page]


Match found!


Correct root cause found!


Basic Technique

1. Detect presentation failure -> Faulty HTML element

2. Find root cause -> Faulty visual property


Prior work: WebSee [2]

• Goal: Detect and localize presentation failures
• Input: Test web page, oracle
• Output: Prioritized list of HTML elements
• Phases
  1. Detection: Image processing techniques to find visual differences
  2. Localization: Maps HTML elements to visual differences
  3. Result set processing: Prioritizes HTML elements based on heuristics


Classification of Visual Properties

• Effective use of search-based techniques
• Define appropriate fitness function
• Based on the impact on rendering of HTML element
  1. Size and Position
  2. Color
  3. Predefined values


Category 1: Size and Position

• E.g.: margin, padding, height, width
• Numeric values
• Population: [-∞, +∞]
• Fitness function
  – Minimize number of difference pixels
  – Property value -> Number of difference pixels


Example

[Figure: oracle beside the test web page]

• e = { <div style=“padding: 10px;”>...</div> }
• Number of difference pixels = 300
• Value = 50px -> No. of difference pixels = 2,100
• Value = 2px -> No. of difference pixels = 175
• ...
• Value = 5px -> No. of difference pixels = 0


Category 2: Color

• E.g.: text color, background-color, border-color
• Color value
  – 140 color names
  – 16 million colors (#000000 to #FFFFFF)
• Population: [#000000, #FFFFFF]
• Fitness function
  – Minimize number of difference pixels -> not useful
  – Determine expected color from oracle -> complex
  – Instead, minimize color distance


Category 2: Color analysis (… contd.)

• Color distance: Euclidean distance between RGB values
• Oracle_avg = average color in oracle
• Test_avg = average color in test web page screenshot
• Color distance = dist(Oracle_avg, Test_avg)
• Property value -> color distance
• Final check -> full image comparison


Example

[Figure: oracle beside the test web page]

• e = { <div style=“color:#000000;”>...</div> }
• Average oracle color = #FFA000
• Average test screenshot color = #8E8E8E
• Color distance = 369
• Value = #FFFFFF -> color distance = 394
• Value = #FFF000 -> color distance = 32
• ...
• Value = #FF0000 -> color distance = 0


Category 3: Predefined values

• E.g.: font-style, display, font-family, border-style
• Set of discrete predefined values
  – font-style = {italic, oblique, normal}
• Exploration method
  – No notion of “closeness” to guide search
    • Genetic algorithm not used
  – Use exhaustive exploration
  – Not very expensive
    • max. 21 elements
    • avg. 5 elements


Experiment

• Evaluate accuracy
• Compare results with random search
• Evaluated for Category 1 and 2 only
• Subject application: Gmail homepage
• Oracle: Gmail homepage screenshot
• Test cases: Seeded faults


Implementation steps

• Goal: Find root cause of presentation failure
• Input:
  1. P: Test web page
  2. O: Oracle
  3. E: Set of potentially faulty HTML elements (provided by WebSee)
• Output: Root cause <HTML element, visual property>


Implementation steps (… contd.)

1. Find possible root cause space
2. Find pool of possibly correct values for each root cause
3. Use genetic algorithm to select candidate value
4. Substitute selected value in web page
5. Compare web page and oracle
6. If web page = oracle, then return root cause
7. Else, continue


Experimental Procedure

• Total 37 test cases
• Run both approaches (ours and random) 5 times on each test case = 37 * 5 * 2 = 370 executions
• Limit search space for experiment to run within 24 hours = 24 hours / 370 ≈ 3.89 min per execution
• Terminate random approach based on genetic algorithm


Experimental results

Category      RCA     Random Search   Test #
1. Numeric    100%    59%             30
2. Color      100%    37%             7
Total         100%    55%             37
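
The Total row for random search is consistent with a test-count-weighted average of the two category rows. The check below assumes that is how the total was derived, which the slides do not state, and it works from the rounded percentages shown in the table.

```python
# (random-search accuracy in percent, number of test cases) per category
random_search = {"Numeric": (59, 30), "Color": (37, 7)}

total_tests = sum(count for _, count in random_search.values())
weighted_accuracy = sum(pct * count for pct, count in random_search.values()) / total_tests

print(total_tests, round(weighted_accuracy))  # prints: 37 55
```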


• Conclusions
  – Validates feasibility of our search-based approach
  – Outperforms random search
• Threats to validity
  – Restriction on the search space
  – Small sample of web applications


Future Work

• Improve performance
  – Improve search space initialization
    • E.g.: For category 1, use sub-image searching
  – Prioritize visual properties
• Create a comprehensive search framework
• Improve fitness functions
• Handle limitation of presence of faulty property
• Handle multiple failures
• Evaluate on several real web applications


Summary

1. Technique for automatic root cause analysis

2. Root cause analysis mapped as a search problem

3. Helpful in debugging presentation failures

4. No HTML/CSS expertise required

5. High accuracy compared to random search


References

1. Front-end Developers Job Postings, URL: http://www-scf.usc.edu/ spmahaja/front-end-job-postings/, Apr 2014.

2. S. Mahajan and W. G. Halfond. Finding HTML Presentation Failures Using Image Comparison Techniques. In submission, 2014.