jszap – compressing javascript · 2019-02-25 · gaurav sinha, iit kanpur . a web 2.0 application...



JSZap: Compressing JavaScript Code

Martin Burtscher, UT Austin

Ben Livshits & Ben Zorn, Microsoft Research

Gaurav Sinha, IIT Kanpur


A Web 2.0 Application Dissected

• 70,000+ lines of JavaScript code downloaded

• 2,855 functions

• 1+ MB of code

• Talks to 14 backend services (traffic, images, directions, ads, …)


Lots of JavaScript being Transmitted

[Figure: bar chart of the fraction of download that is JavaScript (0% to 100%) for www.live.com, spreadsheets.google, maps.live, chi.lexigame, hotmail, gmail, dropthings, maps.google, pageflakes, and bunny hunt]

Up to 85% of a Web 2.0 app is JavaScript code!


AJAX: Tension Headaches

• Execution can’t start without the code

• Move code to client for responsiveness


JavaScript on the Wire

[Diagram: JavaScript delivery pipeline with the labeled stages JavaScript, crunch, gzip, gzip -d, parser, and AST; JSZap is marked on the diagram]


JSZap Approach

• Represent JavaScript as AST instead of source

• Serialize the compressed AST

• Decompress directly into AST on client

• Use gzip as 2nd-level (de-)compressor
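The four steps above can be sketched as a round trip. This is a minimal illustration, not JSZap's actual format: the nested-list AST, the function names, and the use of JSON as the first-level serialization are all assumptions made for the example. Python's zlib stands in for gzip (both use DEFLATE).

```python
import json
import zlib

def compress_ast(ast):
    # First level: serialize the tree itself (JSON is only a stand-in
    # for a real AST encoding; it is not what JSZap uses).
    first_level = json.dumps(ast, separators=(",", ":")).encode("utf-8")
    # Second level: a general-purpose DEFLATE compressor, as gzip would apply.
    return zlib.compress(first_level, 9)

def decompress_to_ast(blob):
    # The client rebuilds the tree directly; no JavaScript source text
    # is reconstructed along the way.
    return json.loads(zlib.decompress(blob).decode("utf-8"))

# Hypothetical AST for the expression a * b + c.
ast = ["+", ["*", ["id", "a"], ["id", "b"]], ["id", "c"]]
blob = compress_ast(ast)
restored = decompress_to_ast(blob)
print(restored == ast)
```

The point of the sketch is the shape of the pipeline: what travels over the wire is a compressed tree, and what comes out on the client is a tree.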


Benefits of AST-based Compression

Reduced Latency

• Compression: less to transmit

• ASTs are blasted directly into the browser

Reduced Network Bandwidth

• Reduces mobile charges

• Reduces operator network costs: better for servers

Correctness, Security, and other Benefits

• Ensures well-formedness of code

• Can use to check language subsets easily (AdSafe)

• Caching incremental updates

• Unblocking HTML parser


JSZap Compression

[Diagram: JavaScript, JSZap, and gzip shown as consecutive stages]


JSZap Compression

[Diagram: JavaScript is split into three streams (identifiers, literals, productions), numbered 1 to 3, which then feed into gzip]


GZIP is a formidable opponent


JSZap vs. GZIP

[Figure: stacked bar chart of compressed size in KB (y-axis 0 to 40), JSZap vs. gzip, broken down by stream]

Stream         JSZap (KB)   gzip (KB)
Literals              5.4         5.4
Identifiers          18.4        19.0
Productions           8.4        11.5
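The chart values can be checked with a little arithmetic. The sketch below assumes that the first number of each pair belongs to JSZap and the second to gzip, and that the pairs correspond to literals, identifiers, and productions in that order. That reading is an assumption about the flattened chart, but it yields per-stream shares of 17%, 57%, and 26%, which match the breakdown quoted later in the talk.

```python
# (JSZap KB, gzip KB) per stream; the pairing is an assumption about the chart.
sizes_kb = {
    "literals": (5.4, 5.4),
    "identifiers": (18.4, 19.0),
    "productions": (8.4, 11.5),
}

jszap_total = sum(j for j, _ in sizes_kb.values())
gzip_total = sum(g for _, g in sizes_kb.values())

# Size of the JSZap output relative to gzip alone (gzip = 1).
ratio = jszap_total / gzip_total

# Share of each stream in the JSZap output.
shares = {name: j / jszap_total for name, (j, _) in sizes_kb.items()}

print(f"JSZap {jszap_total:.1f} KB, gzip {gzip_total:.1f} KB, ratio {ratio:.3f}")
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

Under these assumptions the totals are 32.2 KB for JSZap and 35.9 KB for gzip, a ratio of about 0.90, which is consistent with the roughly 10% savings reported in the summary. Most of the gain in this chart comes from the production stream.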


Talk Outline

[Diagram: the three streams (identifiers, literals, productions), numbered 1 to 3]

• evaluation on real code


Background: ASTs

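Expression: a * b + c

Grammar:

1) E → E + T

2) E → T

3) T → T * F

4) T → F

5) F → id

Tree (production number in parentheses): the root + (1) has children * (3) and c (5); the * node has children a (5) and b (5).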
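As a small illustration of how a tree turns into a sequence of production numbers, the sketch below walks the tree for a * b + c in pre-order and emits the number attached to each node. The tuple representation and the function name are assumptions for the example. Only productions 1, 3, and 5 appear, as in the tree on the slide; the chain productions 2 and 4 have no node of their own in the AST.

```python
# Production number for each kind of AST node, following the slide's labels.
PRODUCTIONS = {"+": 1, "*": 3, "id": 5}

def production_stream(node):
    """Return the production numbers of a tree in pre-order."""
    kind = node[0]
    stream = [PRODUCTIONS[kind]]
    if kind != "id":
        # Operator nodes carry their operands as children.
        for child in node[1:]:
            stream.extend(production_stream(child))
    return stream

# Hypothetical tuple encoding of the tree for a * b + c.
tree = ("+", ("*", ("id", "a"), ("id", "b")), ("id", "c"))
print(production_stream(tree))  # [1, 3, 5, 5, 5]
```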


A Simple JavaScript Example

var y = 2;
function foo () {
  var x = "jscrunch";
  var z = 3;
  z = y + y;
}
x = "jszap";

Identifier Stream

y foo x z z y y x

Literal Stream

"jscrunch" 2 3 "jszap"

Production Stream

1 3 4 ... 1 3 4 ...
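The identifier and literal streams for this example can be approximated with a token-level split. This is only a sketch: it uses a regular expression rather than a real JavaScript parser, the keyword set covers just the two keywords in the snippet, and all names are made up for the illustration. The production stream needs a full grammar and is not attempted here.

```python
import re

SOURCE = '''
var y = 2;
function foo () {
  var x = "jscrunch";
  var z = 3;
  z = y + y;
}
x = "jszap";
'''

# Only the keywords that occur in the example; a real lexer needs the full set.
KEYWORDS = {"var", "function"}
# String literals first, so that words inside strings are not read as identifiers.
TOKEN = re.compile(r'"[^"]*"|\d+|[A-Za-z_$][A-Za-z0-9_$]*')

def split_streams(source):
    """Split source text into an identifier stream and a literal stream."""
    identifiers, literals = [], []
    for token in TOKEN.findall(source):
        if token in KEYWORDS:
            continue
        if token[0] == '"' or token[0].isdigit():
            literals.append(token)
        else:
            identifiers.append(token)
    return identifiers, literals

identifiers, literals = split_streams(SOURCE)
print(" ".join(identifiers))
print(" ".join(literals))
```

The identifiers come out as y foo x z z y y x, matching the slide. The literals come out in source order (2, "jscrunch", 3, "jszap"), which lists the same four values as the slide in a different order.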


Benchmarking JSZap

Benchmark name   Source lines   Source bytes
gmonkey                   922         17,382
getDOMHash              1,136         25,467
bing1                   3,758         77,891
bingmap1                3,473         80,066
livemsg1                5,307         93,982
bingmap2                9,726        113,393
facebook1               5,886        141,469
livemsg2                7,139        156,282
officelive1            22,016        668,051

• JavaScript files up to 22K LOC

• Variety of app types

• Both hand-generated and machine-generated code

• gzipped everything


Components of JavaScript Source

[Figure: stacked bar chart (0% to 100%) of the share of productions, identifiers, and literals in each benchmark: gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1]

• None of the categories can be ignored

• Identifiers become more prominent with code growth


Compressing the Production Stream

• Frequency-based production renaming

• Differential encoding: 26 and 57 => 2 and 3

• Chain rule: eliminate predictable productions

• Tree-based prediction-by-partial-match
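The first technique in this list, frequency-based renaming, can be sketched as follows. The production IDs in the sample stream are hypothetical, and the tie-breaking rule is an assumption added to keep the mapping deterministic. The other three techniques are not shown.

```python
from collections import Counter

def rename_by_frequency(stream):
    """Renumber productions so that the most frequent ones get the smallest IDs."""
    counts = Counter(stream)
    # Most frequent first; ties broken by original ID for determinism.
    ordered = sorted(counts, key=lambda p: (-counts[p], p))
    mapping = {old: new for new, old in enumerate(ordered)}
    return [mapping[p] for p in stream], mapping

# Hypothetical production stream.
stream = [40, 7, 40, 40, 112, 7, 40]
renamed, mapping = rename_by_frequency(stream)

# The decoder needs the inverse mapping to recover the original IDs.
inverse = {new: old for old, new in mapping.items()}
print(renamed)
print([inverse[p] for p in renamed] == stream)
```

Small, frequently repeated IDs give a second-level compressor such as gzip shorter and more regular input. The cost is that the mapping has to be known to the decoder.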


PPMC

• Consider compressing

– if (P) then X else X

• Should be very compressible

• if (P) then ...abc... else ...abc...

[Diagram: tree with the nodes P, X, and X]

• Tree context used to build a predictor

• Provides the next likely child node given context C and child position p

• Arithmetic coding: more likely=shorter IDs

• See paper for details
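The idea of predicting a child node from its tree context can be illustrated with a much simpler model than PPMC. The sketch below is an assumption-laden stand-in: it uses a single context (parent kind, child position), Laplace smoothing instead of PPM escape symbols, and it reports the ideal code length of -log2(p) bits per node rather than running an arithmetic coder. The tuple trees and node names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def walk(node, parent="ROOT", position=0):
    """Yield (context, symbol) pairs in pre-order; context = (parent kind, child position)."""
    kind, children = node[0], node[1:]
    yield (parent, position), kind
    for i, child in enumerate(children):
        yield from walk(child, kind, i)

def estimated_bits(tree, alphabet_size):
    """Ideal code length of a tree under an adaptive context model."""
    counts = defaultdict(Counter)
    bits = 0.0
    for context, symbol in walk(tree):
        seen = counts[context]
        total = sum(seen.values())
        # Laplace smoothing: a simple substitute for PPM's escape mechanism.
        p = (seen[symbol] + 1) / (total + alphabet_size)
        bits += -math.log2(p)
        seen[symbol] += 1  # Adapt after coding, as a decoder could too.
    return bits

X = ("call", ("a",), ("b",), ("c",))
Y = ("call", ("d",), ("e",), ("f",))
repeated = ("if", ("P",), X, X)   # if (P) then X else X
varied = ("if", ("P",), X, Y)     # if (P) then X else Y

print(estimated_bits(repeated, 16) < estimated_bits(varied, 16))
```

In the repeated case, the children of the second X have already been seen in the same contexts, so the model assigns them a higher probability and the estimated length drops. That is the effect the slide describes: more likely nodes get shorter codes.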


Production Compression with PPMC

[Figure: bar chart of production compression relative to gzip (gzip = 1), y-axis 50% to 100%, per benchmark: gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1; data label: 0.6772]


Compressing the Identifier Stream

• Symbol tables instead of identifier stream:

– Compress redundancy: offset into table

– Global or local symbol tables

– Use variable-length encoding

• Other techniques:

– Sort symbols by frequency

– Rename local variables
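A minimal sketch of the symbol-table idea, using the identifier stream from the earlier example: each distinct name is stored once, sorted by frequency, and the stream becomes a list of offsets into the table. This shows a single global table only; local tables, renaming of locals, and the byte-level encoding are left out, and the function names are assumptions.

```python
from collections import Counter

def build_symbol_table(identifiers):
    """Return (table, indices): names sorted by frequency, and the stream as table offsets."""
    counts = Counter(identifiers)
    # Most frequent names get the smallest offsets; ties broken alphabetically.
    table = sorted(counts, key=lambda name: (-counts[name], name))
    index = {name: i for i, name in enumerate(table)}
    return table, [index[name] for name in identifiers]

def restore(table, indices):
    """Rebuild the original identifier stream from the table and offsets."""
    return [table[i] for i in indices]

stream = "y foo x z z y y x".split()
table, indices = build_symbol_table(stream)
print(table)    # ['y', 'x', 'z', 'foo']
print(indices)  # [0, 3, 1, 2, 2, 0, 0, 1]
```

Each repeated name now costs one small integer instead of its full spelling, and sorting by frequency keeps the most common offsets small, which helps a variable-length encoding.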


Variable-length Encoding for Identifiers

[Diagram: decision tree with the questions "is global?", "is renamed local", and "fits in 1 byte?", leading to the bit prefixes 00…, 01…, 11…, and 10…]
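A prefix-tagged encoding of this kind can be sketched as below. The exact assignment of prefixes to cases cannot be recovered from the flattened diagram, so the layout here is an assumption chosen for illustration: the top bit marks the scope (0 = global, 1 = local), the next bit marks whether the index needs a second byte, and the remaining 6 or 14 bits hold the symbol-table index. It is not JSZap's actual format.

```python
def encode_ref(is_global, index):
    """Encode one symbol-table reference in 1 or 2 bytes (tag layout is an assumption)."""
    if not 0 <= index < (1 << 14):
        raise ValueError("index does not fit in 14 bits")
    scope_bit = 0 if is_global else 1
    if index < (1 << 6):
        # Tag bits: scope, then 0 = fits in 1 byte; 6-bit index follows.
        return bytes([(scope_bit << 7) | index])
    # Tag bits: scope, then 1 = needs 2 bytes; 14-bit index follows.
    return bytes([(scope_bit << 7) | 0x40 | (index >> 8), index & 0xFF])

def decode_ref(data, pos=0):
    """Return (is_global, index, next_pos) for the reference starting at pos."""
    first = data[pos]
    is_global = (first & 0x80) == 0
    if first & 0x40:
        index = ((first & 0x3F) << 8) | data[pos + 1]
        return is_global, index, pos + 2
    return is_global, first & 0x3F, pos + 1

samples = [(True, 5), (True, 300), (False, 63), (False, 16383)]
encoded = [encode_ref(g, i) for g, i in samples]
print([len(e) for e in encoded])  # [1, 2, 1, 2]
```

With a frequency-sorted symbol table, most references fall into the 1-byte case, so the two tag bits cost little compared with always using 2 bytes.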


Variable-Length Identifier Encoding

[Figure: stacked bar chart (0% to 100%) per benchmark (gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1) showing the breakdown of identifier references into the categories parent, local 2byte, local 1byte, local builtin, global 2byte, and global 1byte]


Symbol Tables: Effectiveness

[Figure: bar chart of identifier size relative to no symbol table (No ST = 1), y-axis 80% to 100%, per benchmark (gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1), with the series Global ST and VarEnc; data labels: 0.943 and 89%]


Compressing Literals

• Symbol tables

• Grouping literals by type

• Prefixes and postfixes

• These techniques result in 5-10% savings compared to gzip
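Grouping literals by type can be sketched as follows: strings and numbers go into separate lists, and a short tag sequence records the original interleaving so the stream can be rebuilt. The function names and the one-letter tags are assumptions for the example; symbol tables and prefix/postfix handling are not shown.

```python
def group_literals(literals):
    """Split a literal stream into per-type groups plus one type tag per position."""
    tags, strings, numbers = [], [], []
    for lit in literals:
        if isinstance(lit, str):
            tags.append("s")
            strings.append(lit)
        else:
            tags.append("n")
            numbers.append(lit)
    return tags, strings, numbers

def ungroup(tags, strings, numbers):
    """Interleave the groups again according to the tags."""
    s, n = iter(strings), iter(numbers)
    return [next(s) if tag == "s" else next(n) for tag in tags]

# Literals of the earlier example, in source order.
literals = [2, "jscrunch", 3, "jszap"]
tags, strings, numbers = group_literals(literals)
print(strings, numbers)
print(ungroup(tags, strings, numbers) == literals)
```

Keeping similar values next to each other (here "jscrunch" beside "jszap") tends to give a dictionary-based compressor like gzip more nearby matches than a mixed stream would.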


Average JSZap Compression: 10%

[Figure: bar chart of JSZap compressed size relative to gzip (gzip = 1), y-axis 80% to 100%, per benchmark: gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1; data label: 0.8792]

Productions, 26%

Identifiers, 57%

Literals, 17%

13% savings


Summary and Conclusions

• JSZap: AST-based compression for JavaScript

• Propose a range of techniques for compressing

– Productions

– Identifiers

– Literals

• Preliminary results are encouraging: 10% savings over gzip

• Future focus

– Latency measurements

– Browser integration


Well-formedness

Security (AdSafe)

AST representation

Unblocking HTML parser

Caching and incremental updates

Compression with JSZap

Questions?