Parallel Webpage Layout. Leo Meyerovich, Chan Siu Man, Chan Siu On, Heidi Pan, Krste Asanovic, Rastislav Bodik, and many others from the UPCRC Berkeley project, UC Berkeley


Page 1: Parallel Webpage Layout - University of California, Berkeleyparlab.eecs.berkeley.edu/sites/all/parlab/files/playout.pdf · 2010-10-05 · 2. Distilled essential CSS and wrote a declarative

Parallel Webpage Layout

Leo Meyerovich, Chan Siu Man, Chan Siu On, Heidi Pan, Krste Asanovic, Rastislav Bodik

and many others from the UPCRC Berkeley project

UC Berkeley


Par Lab Research Overview

[Figure: Par Lab research stack. Box labels: Personal Health; Image Retrieval; Hearing, Music; Speech; Parallel Browser; Motifs; Sketching; Composition & Coordination Language (C&CL); C&CL Compiler/Interpreter; Parallel Libraries; Parallel Frameworks; Efficiency Languages; Efficiency Language Compilers; Autotuners; Legacy Code; Schedulers; Communication & Synch. Primitives; Legacy OS; OS Libraries & Services; Hypervisor; Multicore/GPGPU; RAMP Manycore. Correctness: Static Verification; Type Systems; Directed Testing; Dynamic Checking; Debugging with Replay. Side label: Diagnosing Power/Performance.]


Parallel Web Browser

Why the browser?
– an important application platform
– browser wars again: competing on performance (latency)
– how important? handheld page load is tens of CPU seconds

Why a parallel browser?
– soon in your phone? 4 cores x 2 threads x 8-wide SIMD = 64
– parallelism is more energy efficient

Technical challenge
– Parallelize the browser to run with 100-way parallelism


This Talk: Parallelize Single Page Layout

• Page layout (HTML+CSS) is the LaTeX of the Web
– latex takes seconds to format a document
– but page load should be 20-100 ms
– page load is a bottleneck: 51% of CPU time on IE8

• Page layout is a challenging "desktop" application
– not parallelized before
– specifications: often ambiguous and sequential
– low-latency: problems are short-running
– less understood motif: tree computation

• Knuth: "Multiprocessors are no help to TEX"


Our Contributions

1. Analyzed browser performance
– layout is a bottleneck; we identified its critical motifs

2. Distilled essential CSS and wrote a declarative spec for it
– crucial step for exposing parallelism hidden by today's spec

3. Developed first parallel page layout algorithms
(1) matching: task parallel, 20x speedup, strongly scales to 16
(2) solving: task parallel, 4x speedup, strongly scales to 3 cores

4. Future steps – components and algorithms


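Overall Page Layout Problem

Input: document tree + CSS rules
Output: sizes and positions of tree nodes
Steps: determine styling rules; solve constraints

CSS styling rules:
p { width: 100% }
img { width: 100px; float: left }
p img { width: 10pt }

HTML:
<body>hello<img src="http:..."><p><b>world</b>ok ok ok ok ok

[Figure: the CSS styling rules plus the HTML yield a document tree (<body> with two <p> children; leaves <img>, "hello", <b> "world", "ok ok ok ok", "ok"). Nodes are annotated with style properties (p: width=100%; img: width=100px, float=left; p img: width=10px; align=right; the matched image: float=left, width=10px). Layout turns unknown positions (x=?, y=?) into solved ones (x=12, y=17). Other labels: "What the browser does", "Our page layout subproblem", 25%, 25%.]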


The layout spec is confusing

Example of spec:
– "In general, the left edge of a line box touches the left edge of its containing block … However, floating boxes may come between [them]."

Hard to implement correctly, even sequentially.

[Figure: the same page rendered in Safari and in Firefox.]


Flow: sequential layout in today's browsers

The simplest way to implement the spec seems to be to (mostly) flow the elements sequentially, in order


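Flow is guided by a cursor

Cursor points to where next element goes

[Figure: the rendered text ("hello", "world ok ok", "ok ok ok") with cursor marks (Δ) between elements, next to the document tree (<body>; two <p>; <img>, "hello", <b> "world", "ok ok ok ok", "ok") with a Δ on each edge.]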
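A minimal sketch of cursor-guided flow may help make the sequential dependence concrete. It assumes fixed-width characters and a single wrapping rule, which is a simplification for illustration and not the CSS algorithm; the function name `flow` and its parameters are hypothetical.

```python
def flow(words, container_width, char_width=10, line_height=10):
    """Place words left to right, wrapping lines; returns (word, x, y) tuples.

    Hypothetical simplification: every character is char_width units wide
    and words are separated by one space of the same width.
    """
    placed = []
    cursor_x, cursor_y = 0, 0
    for word in words:
        width = len(word) * char_width
        # Wrap when the word does not fit on the current line.
        if cursor_x > 0 and cursor_x + width > container_width:
            cursor_x = 0
            cursor_y += line_height
        placed.append((word, cursor_x, cursor_y))
        # Advance the cursor past the word and one space.
        cursor_x += width + char_width
    return placed
```

Every placement reads the cursor left behind by the previous element, so no element can be positioned before all earlier ones are done.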


Flow's dependences

[Figure: the document tree (<body>; two <p>; <img>, "hello", <b> "world", "ok ok ok ok", "ok"). Each node is annotated with computed width (w), font size (fs), position (x, y), and height (h); input annotations include w=100, fs=12; w=50, float=left; and fs=50%. Edges are labeled with the values that flow along them: fs, Δ, w.]

constraints not specified if equality (e.g., inherited) or intrinsic (e.g., default image size or aspect ratio)


Dependencies prevent parallelism

[Figure: the same annotated document tree, now with w=200, fs=12 as one of the inputs. Nodes carry w, fs, x, y, and h values; edges are labeled fs, Δ, w. Two node annotations (w=40, fs=6, x=0, y=0, h=10 and w=100, fs=12, x=0, y=10) are called out along a Δ edge.]


Enable parallelism by doing part of work

[Figure: the same annotated document tree (inputs include w=200, fs=12; w=50, float=left; fs=50%). Nodes carry w, fs, x, y, and h values; edges are labeled fs, Δ, w. One node annotation (w=30, fs=12, x=50, y=10, h=10) is called out.]


Parallel Layout Solving: Five Phases

Extensive analysis led us to five phases
These enable parallelism

1. font size, tentative widths
2. preferred widths: max, min
3. final widths: break cycles by over-specifying CSS
4. heights, relative x/y positions
5. absolute x/y positions
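The phase structure can be sketched as five tree traversals, each computing a few attributes per node. This is an illustrative toy and not the authors' solver: the `Node` fields, the vertical-stacking-only layout rules, and the alternating top-down/bottom-up traversal order are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    font_scale: float = 1.0        # e.g. 0.5 for "fs=50%"
    intrinsic_width: float = 0.0   # hypothetical content width of a leaf
    intrinsic_height: float = 0.0  # hypothetical content height of a leaf
    children: List["Node"] = field(default_factory=list)
    fs: float = 0.0
    pref_w: float = 0.0
    w: float = 0.0
    h: float = 0.0
    rel_y: float = 0.0
    abs_y: float = 0.0


def top_down(node, visit, parent=None):
    visit(node, parent)
    for child in node.children:    # siblings are independent: could run in parallel
        top_down(child, visit, node)


def bottom_up(node, visit):
    for child in node.children:    # subtrees are independent: could run in parallel
        bottom_up(child, visit)
    visit(node)


def layout(root, page_width, base_font=12.0):
    def phase1(n, p):              # font size (inherited, then scaled)
        n.fs = (p.fs if p else base_font) * n.font_scale

    def phase2(n):                 # preferred width from content
        n.pref_w = max([n.intrinsic_width] + [c.pref_w for c in n.children])

    def phase3(n, p):              # final width, limited by the parent
        n.w = page_width if p is None else min(n.pref_w, p.w)

    def phase4(n):                 # height and relative y (children stacked vertically)
        y = 0.0
        for c in n.children:
            c.rel_y = y
            y += c.h
        n.h = y if n.children else n.intrinsic_height

    def phase5(n, p):              # absolute y
        n.abs_y = (p.abs_y if p else 0.0) + n.rel_y

    top_down(root, phase1)
    bottom_up(root, phase2)
    top_down(root, phase3)
    bottom_up(root, phase4)
    top_down(root, phase5)
    return root
```

Within a single phase, a node only reads values from its parent (top-down) or its children (bottom-up), which is what would let sibling subtrees be processed as independent tasks.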


Each Phase Exhibits Tree Parallelism

Phase 1: font size, temporary width
Phase 2: preferred max & min width
Phase 3: width
Phase 4: height, relative x/y position
Phase 5: absolute x/y position

[Figure: two copies of the document tree (<body>; two <p>; <img>, "hello", <b> "world", "ok ok ok ok", "ok"). Inputs shown: w=100, fs=12; float=left; fs=50%. Nodes are annotated with per-phase results: font sizes (fs=12 or fs=6) and wp/wm pairs (for example wp=80, wm=40; wp=50, wm=50; wp=10, wm=10).]



Parallel Layout: Speculative Evaluation

• Did not break dependencies for floats
– might stick out of paragraph

• Speculate: assume no floats
• Check
• Patch up as needed
– floats rare
– We believe overflow is minimal

[Figure: two renderings of "hello" and "world ok ok / ok ok ok", illustrating a float that sticks out of its paragraph.]
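The speculate/check/patch-up loop can be sketched as follows. This is a toy model under stated assumptions, not the authors' implementation: paragraphs are dictionaries with a line count and an optional `float_height`, and an intruding float is assumed to add one extra text line per `LINE_HEIGHT` of intrusion. All names are hypothetical.

```python
import math

LINE_HEIGHT = 10


def layout_paragraph(par, intrusion=0):
    """Return (height, overflow) for one paragraph.

    intrusion is how far a float from above reaches into this paragraph;
    overflow is how far any float sticks out below this paragraph.
    """
    extra_lines = math.ceil(intrusion / LINE_HEIGHT)
    height = (par["lines"] + extra_lines) * LINE_HEIGHT
    overflow = max(0, par.get("float_height", 0) - height, intrusion - height)
    return height, overflow


def speculative_layout(paragraphs):
    # Speculate: lay out every paragraph as if no float intrudes from above.
    # These calls are independent, so they could run in parallel.
    results = [layout_paragraph(p) for p in paragraphs]

    patched = []
    carry = 0
    for i, par in enumerate(paragraphs):
        if carry > 0:
            # Check failed: a float sticks out of the previous paragraph.
            results[i] = layout_paragraph(par, carry)  # Patch up.
            patched.append(i)
        carry = results[i][1]
    return [height for height, _ in results], patched
```

If floats are rare, as the slide suggests, the sequential check touches few paragraphs and most of the speculative work is kept.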


Berkeley Style Sheet Layout Language

• Can compile essential CSS into it
• Refactored CSS to separate features
• Simplifies: correctness, parallelization, use


Analysis

• Model: sequential speed ~= Firefox speed
• Cilk++: 4x speedup, scales to 3 cores
• Need to SIMDize leaves

[Figure: document tree (<body>; two <p>; <img>, "hellooo", <b>, "ok", "ok").]

[Chart: "Modeled Speedup w/ Cilk++". x-axis: # Hardware Threads (1 to 16); y-axis: Average Speedup (0 to 4). Series: Eight socket x 4 core AMD Opteron 2356 Barcelona Sun X4600; Dual socket x 4 core AMD Opteron 2356 Barcelona Sun X2200; Preproduction 2 socket x 4 core x 2 thread Intel Xeon Nehalem.]


Rule Matching: Problem Statement

• Matching
– Tag path (img: <body><p><img>)
– Rule Selectors
– For each tag path: which selectors are ~substrings?

• Rule resolution
– Prioritize properties by rule order: lower overrides

[Figure: the document tree (<body>; two <p>; <img>, "hello", <b> "world", "ok ok ok ok", "ok") beside a table of selectors (p, img, p img) and their properties (width=100%; width=100px, float=left; width=100px; width=10px). One node is annotated width=100px, float=left.]
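A small sketch of matching and resolution, assuming that "~substring" means the selector's tags occur in order along the tag path and end at the node itself (an interpretation for illustration, not a full CSS selector engine). The function names and the rule representation as (selector, declarations) pairs are hypothetical.

```python
def selector_matches(selector, tag_path):
    """True if the selector's tags occur in order in tag_path, ending at the node."""
    tags = selector.split()
    if not tags or not tag_path or tags[-1] != tag_path[-1]:
        return False
    ancestors = iter(tag_path[:-1])
    # Each earlier selector tag must be found, in order, among the ancestors.
    # "tag in ancestors" consumes the iterator, which enforces the ordering.
    return all(tag in ancestors for tag in tags[:-1])


def resolve(rules, tag_path):
    """Merge the declarations of all matching rules; lower rules override."""
    props = {}
    for selector, declarations in rules:
        if selector_matches(selector, tag_path):
            props.update(declarations)
    return props


rules = [
    ("p", {"width": "100%"}),
    ("img", {"width": "100px", "float": "left"}),
    ("p img", {"width": "10px"}),
]
image_style = resolve(rules, ["body", "p", "img"])
```

For the tag path <body><p><img>, both `img` and `p img` match; the later rule overrides only `width`, so `float` from the earlier rule is kept.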


Rule Matching: Parallelization

• ~600 nodes, 1000s rules
• Assign nodes to cores
– load balancing: random assignment

• SIMDizable?

[Figure: the document tree (<body>; two <p>; <img>, "hello", <b> "world", "ok ok ok ok", "ok") beside the selector table (selectors: p, img, p img; properties: width=100%; width=100px, float=left; width=100px).]
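Random assignment of nodes to workers might look like the sketch below. It uses a thread pool purely to keep the example short and self-contained; that choice, the toy `match_node` matcher (exact tag match only), and all names are assumptions for illustration rather than the prototype described in the talk.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def match_node(tag_path, rules):
    """Toy matcher: a rule applies when its selector equals the node's own tag."""
    props = {}
    for selector, declarations in rules:
        if selector == tag_path[-1]:
            props.update(declarations)
    return props


def parallel_match(tag_paths, rules, workers=4, seed=0):
    # Load balancing: assign each node to a randomly chosen worker.
    rng = random.Random(seed)
    buckets = [[] for _ in range(workers)]
    for index in range(len(tag_paths)):
        buckets[rng.randrange(workers)].append(index)

    def run(bucket):
        # Nodes are matched independently of one another.
        return [(i, match_node(tag_paths[i], rules)) for i in bucket]

    results = [None] * len(tag_paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for chunk in pool.map(run, buckets):
            for i, props in chunk:
                results[i] = props
    return results
```

Because each node's match reads only the shared rule list, the buckets need no synchronization beyond collecting the results.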


Analysis

• Results
– perfect scaling: up to 10 cores
– 20x speedup on 32 cores
– … but with python
  • interp. overhead (seq.)
  • procs., not threads

• Future
– C++ implementation
– SIMD rule matching

[Chart: "Speedup vs # Cores (w/ Python)". x-axis: # Hardware Threads (1, 4, 8, 16, 32); y-axis: Average Speedup (0 to 32). Series: Slashdot, Rotten Tomatoes, Wikipedia, NY Times. Machine: 8 socket x 4 cores AMD Opteron 2356 Barcelona.]


Takeaways

• Artifacts
– BSS/CSS specification & dependency decomposition
– 4x solving speedup (untuned), 20x matching (in python)

• Lessons
– 4x << 100x: SIMDize low-level libraries (e.g., fonts)
– motifs: low latency tree ops, vectors, pixel blending
– attribute grammars helped

• Next steps
– tune tasks, SIMD kernels, bigger scope of model
– implications for concurrent scripts using layout?


(questions?)