PDF.JS at SwissJeese 2012

Julian Viereck @jviereck +julian.viereck

Upload: julian-viereck

Post on 11-Nov-2014

TRANSCRIPT

Page 1: PDF.JS at SwissJeese 2012

Julian Viereck

@jviereck +julian.viereck

Page 2: PDF.JS at SwissJeese 2012

Overview

• What is PDF.JS about (5 min)

• How PDF is structured & processing in PDF.JS (10 min)

• “Why are you doing this?” (15 min)

• Firefox Integration (5 min)

• What’s next? (5 min)

• Demo (15 min)

• Q & A (5 min)

Page 3: PDF.JS at SwissJeese 2012

About me

• Bespin / Skywriter

• Ace

• Firefox DevTools

• ETH Zurich (Physics)

• PDF.JS

• ?

Page 4: PDF.JS at SwissJeese 2012

PDF Viewer using Open Web Standards

Page 5: PDF.JS at SwissJeese 2012

What is PDF.JS

• building a faithful & efficient PDF viewer

• HTML5 technology experiment

• no native code

• secure (web sandbox)

• Mozilla Labs Project - Open Source (Github)

Page 6: PDF.JS at SwissJeese 2012

What is PDF.JS

• Not Firefox-Specific - all modern browsers

• 1.3 MB uncompressed JS

• ~ 33,000 lines of code

• viewer in different languages

• async API

Page 7: PDF.JS at SwissJeese 2012

How PDF is structured

PDF file:

• Header: PDF version

• Body [Objects]: sequence of objects (fonts, drawing cmds, images, words, bookmarks, form fields)

• xRef Table: mapping objID ⇔ byte offset

• Trailer: root objID, xRef byte offset (root obj = ref to pages catalog)
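The layout above can be walked from the end of the file backwards: the trailer gives the xRef byte offset, and the xRef table maps each objID to a byte offset. The following is a minimal sketch of that idea, not PDF.JS code. The helper names `buildTinyPdf` and `readXRef` are made up for illustration, and the sketch assumes a classic uncompressed xRef table with a single subsection and an ASCII-only file (so string indices equal byte offsets).

```javascript
// Hypothetical helper: assemble a tiny PDF-like file with correct offsets.
function buildTinyPdf() {
  const header = "%PDF-1.4\n";
  const obj1 = "1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n";
  const obj2 = "2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n";
  const off1 = header.length;
  const off2 = off1 + obj1.length;
  const xrefOffset = off2 + obj2.length;
  const pad = (n) => String(n).padStart(10, "0");
  const xref =
    "xref\n0 3\n" +
    "0000000000 65535 f \n" +
    `${pad(off1)} 00000 n \n` +
    `${pad(off2)} 00000 n \n`;
  const trailer =
    `trailer\n<< /Size 3 /Root 1 0 R >>\nstartxref\n${xrefOffset}\n%%EOF\n`;
  return header + obj1 + obj2 + xref + trailer;
}

// Hypothetical helper: read trailer and xRef table of such a simple file.
function readXRef(pdfText) {
  // The file ends with "startxref", the xRef byte offset, and "%%EOF".
  const m = /startxref\s+(\d+)\s+%%EOF\s*$/.exec(pdfText);
  if (!m) throw new Error("startxref not found");
  const xrefOffset = Number(m[1]);

  // At that offset: "xref", then "<firstObjId> <count>", then one line per object.
  const tail = pdfText.slice(xrefOffset);
  const lines = tail.split(/\r?\n/);
  if (lines[0].trim() !== "xref") throw new Error("xref table not found");
  const [first, count] = lines[1].trim().split(/\s+/).map(Number);

  const offsets = new Map(); // objID -> byte offset
  for (let i = 0; i < count; i++) {
    const [offset, , kind] = lines[2 + i].trim().split(/\s+/);
    if (kind === "n") offsets.set(first + i, Number(offset)); // "n" = in use, "f" = free
  }

  // The trailer dictionary names the root object as "/Root <objID> <gen> R".
  const root = /\/Root\s+(\d+)\s+\d+\s+R/.exec(tail);
  return { xrefOffset, offsets, rootObjId: root ? Number(root[1]) : null };
}

const pdf = buildTinyPdf();
const info = readXRef(pdf);
// Jump straight to the root object without scanning the body.
const rootText = pdf.slice(info.offsets.get(info.rootObjId));
```

The point of the design is random access: a viewer only needs the last few bytes plus the xRef table to find any object by ID.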

Page 8: PDF.JS at SwissJeese 2012

Let’s look at it

Page 9: PDF.JS at SwissJeese 2012

Processing in PDF.JS

• get plain Uint8Array via XHR2, build Stream

• new PDFDoc(stream): read xRef, root object

• page = PDFDoc.getPage(N)

• page.startRendering(graphics)

• read & convert all PDF cmds ➟ OL (OperationList), done by PartialEvaluator

• load required objects (fonts, images)

• graphics.executeOperatorList(OL), done by CanvasGraphics
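The last step, executing an operator list on a graphics object, can be sketched as a simple dispatch loop. This is an illustration only: `RecordingGraphics` is a hypothetical stand-in for CanvasGraphics that records calls instead of drawing, and the `{ fn, args }` entry shape is an assumption, not the format PDF.JS uses internally. Only the method name `executeOperatorList` comes from the slide.

```javascript
// Hypothetical stand-in for CanvasGraphics: records calls instead of drawing.
class RecordingGraphics {
  constructor() {
    this.log = [];
  }
  beginText() { this.log.push("beginText"); }
  moveText(x, y) { this.log.push(`moveText ${x} ${y}`); }
  showText(text) { this.log.push(`showText ${text}`); }
  endText() { this.log.push("endText"); }

  // Walk the list in order and call the method named by each entry.
  executeOperatorList(operatorList) {
    for (const { fn, args } of operatorList) {
      if (typeof this[fn] !== "function") {
        throw new Error(`unknown operator: ${fn}`);
      }
      this[fn](...args);
    }
  }
}

// Assumed entry shape: method name plus already-resolved arguments.
const operatorList = [
  { fn: "beginText", args: [] },
  { fn: "moveText", args: [100, 700] },
  { fn: "showText", args: ["Hello World!"] },
  { fn: "endText", args: [] },
];

const graphics = new RecordingGraphics();
graphics.executeOperatorList(operatorList);
```

Because the list is plain data, the same list could be replayed on a different backend (screen canvas, print canvas) without re-reading the PDF.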

Page 10: PDF.JS at SwissJeese 2012

Execution Example

• “get page 2” ➟ PartialEvaluator

• PartialEvaluator builds drawing cmds: draw(obj#3, dict.x, dict.y)

• Graphics asks Data: obj#3? dict.x, .y?

• Data answers: obj#3 = ”foo”, x = 20, y = 30

• Graphics: draw on canvas

Page 11: PDF.JS at SwissJeese 2012

Problem Processing

• Extracting data slow (compressed)

• Transform data (images) slow

• Sometimes a lot of objects on page

➡ Freezes UI

➡ Use WebWorker

➡ :( no direct memory access, only postMessage

Page 12: PDF.JS at SwissJeese 2012

Web Worker:

• “get page 2” ➟ PartialEvaluator

• PartialEvaluator builds draw(obj#3, dict.x, dict.y) and looks up the data in Data

• sends OpList = Operation List + Data: draw(“foo”, 20, 30)

Main Thread:

• Graphics: draw on canvas
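Since the main thread cannot reach into the worker's memory, the worker has to replace every reference with the actual data before sending the list. The sketch below illustrates that step under stated assumptions: the `{ ref: "..." }` notation, the `data` store and all function names are hypothetical, and `simulatePostMessage` uses a JSON round trip as a rough stand-in for the copy that a real `postMessage` makes (the real mechanism is the structured clone algorithm).

```javascript
// Hypothetical object store living on the worker side (the "Data" box).
const data = { "obj#3": "foo", dict: { x: 20, y: 30 } };

// A reference is modelled as { ref: "path" }; anything else is a plain value.
function resolve(arg, store) {
  if (arg === null || typeof arg !== "object" || !("ref" in arg)) return arg;
  return arg.ref.split(".").reduce((node, key) => node[key], store);
}

// Worker side: replace every reference so the list no longer needs the store.
function buildSelfContainedList(operatorList, store) {
  return operatorList.map(({ fn, args }) => ({
    fn,
    args: args.map((a) => resolve(a, store)),
  }));
}

// Stand-in for postMessage: the receiver gets a copy, never shared memory.
function simulatePostMessage(message) {
  return JSON.parse(JSON.stringify(message));
}

// draw(obj#3, dict.x, dict.y) as built by the evaluator ...
const unresolved = [
  { fn: "draw", args: [{ ref: "obj#3" }, { ref: "dict.x" }, { ref: "dict.y" }] },
];

// ... becomes draw("foo", 20, 30) by the time it reaches the main thread.
const received = simulatePostMessage(buildSelfContainedList(unresolved, data));
```

The trade-off is extra copying per page in exchange for a UI thread that only has to replay drawing calls.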

Page 13: PDF.JS at SwissJeese 2012

PDF object (input to PartialEvaluator, together with xRef, catalog, resources):

5 0 obj
<< /Length 8 0 R >>
stream
/GS1 gs
/F0 12 Tf
BT
100 700 Td
(Hello World!) Tj
ET
50 600 m
400 600 l
S
endstream
endobj

OL (output of PartialEvaluator, input to Graphics):

setGState: [ LW: 10 ]
dependency: [ font0 ]
setFont: font0, 12
beginText
moveText: 100, 700
showText: “Hello World!”
endText
moveTo: 50, 600
lineTo: 400, 600
stroke
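The conversion from PDF drawing commands to an operator list can be sketched as a small postfix parser: operands are collected until an operator token arrives. This is a simplified illustration, not the PartialEvaluator itself. It covers only the operators in the example, assumes strings without nested parentheses or escapes, and leaves resource names such as GS1 and F0 unresolved (the real evaluator looks them up in the page resources and emits font dependencies). The function names and the `{ fn, args }` shape are assumptions.

```javascript
// PDF operator -> operator-list name, limited to the operators in the example.
const OPERATOR_NAMES = new Map([
  ["gs", "setGState"], ["Tf", "setFont"], ["BT", "beginText"],
  ["Td", "moveText"], ["Tj", "showText"], ["ET", "endText"],
  ["m", "moveTo"], ["l", "lineTo"], ["S", "stroke"],
]);

// Split a content stream into (strings), /names, numbers and operators.
// Assumption: no nested parentheses or escape sequences inside strings.
function tokenize(stream) {
  return stream.match(/\([^)]*\)|\/[^\s()\/]+|[^\s()\/]+/g) || [];
}

function toOperatorList(stream) {
  const list = [];
  let args = []; // PDF is postfix: operands first, then the operator
  for (const tok of tokenize(stream)) {
    if (tok.startsWith("(")) {
      args.push(tok.slice(1, -1)); // string literal without parentheses
    } else if (tok.startsWith("/")) {
      args.push(tok.slice(1)); // name, kept unresolved in this sketch
    } else if (/^[+-]?(\d+\.?\d*|\.\d+)$/.test(tok)) {
      args.push(Number(tok));
    } else {
      const fn = OPERATOR_NAMES.get(tok);
      if (!fn) throw new Error(`unsupported operator: ${tok}`);
      list.push({ fn, args });
      args = [];
    }
  }
  return list;
}

const contentStream =
  "/GS1 gs /F0 12 Tf BT 100 700 Td (Hello World!) Tj ET 50 600 m 400 600 l S";
const ol = toOperatorList(contentStream);
```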

Page 14: PDF.JS at SwissJeese 2012

Images

• JPEG streams:

• DOMImg.src = 'data:image/jpeg;base64,' + window.btoa(bytesToString(bytes));

• If not JPEG stream:

• read bytes, convert to colorspace

• imgData = canvas.getImageData()

• fillWithPixelData(bytes, imgData)

• canvas.putImageData(imgData)
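Both image paths can be sketched without a browser. The first function mirrors the slide's data-URL line; the second shows one possible shape of the pixel-filling step for already decoded RGB data. `jpegDataUrl` and `fillRgba` are illustrative names, and `fillRgba` writes into a plain RGBA array (the layout of `ImageData.data`) rather than an ImageData object, which is a simplification of the `fillWithPixelData(bytes, imgData)` call named on the slide.

```javascript
// Turn raw bytes into a "binary string" (one char per byte), as btoa expects.
function bytesToString(bytes) {
  let s = "";
  for (const b of bytes) s += String.fromCharCode(b);
  return s;
}

// JPEG path: hand the untouched stream to the browser as a data URL,
// so the browser's native JPEG decoder does the work.
function jpegDataUrl(bytes) {
  return "data:image/jpeg;base64," + btoa(bytesToString(bytes));
}

// Non-JPEG path: copy decoded RGB triples into RGBA pixel storage,
// setting alpha to fully opaque.
function fillRgba(rgbBytes, rgba) {
  for (let src = 0, dst = 0; src < rgbBytes.length; src += 3, dst += 4) {
    rgba[dst] = rgbBytes[src];
    rgba[dst + 1] = rgbBytes[src + 1];
    rgba[dst + 2] = rgbBytes[src + 2];
    rgba[dst + 3] = 255;
  }
  return rgba;
}

// Every JPEG file starts with the bytes FF D8 FF.
const url = jpegDataUrl(new Uint8Array([0xff, 0xd8, 0xff]));

// Two RGB pixels become two RGBA pixels.
const pixels = fillRgba([1, 2, 3, 4, 5, 6], new Uint8ClampedArray(8));
```

In a page, `url` would be assigned to an image's `src`, and `pixels` would be the `data` of an ImageData object passed to `putImageData`.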

Page 15: PDF.JS at SwissJeese 2012

JPEG, but...

• no native support for JPEG 2000, CMYK

➡ use JS implementation

‣ works, not that performant but good enough
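As an example of what a JS implementation has to do for CMYK data, here is the textbook uncalibrated CMYK to RGB formula. It is a deliberately naive sketch: the function name is made up, channel values are assumed to be 0..255 with 255 meaning full ink, and real viewers typically use more careful color conversion than this.

```javascript
// Naive, uncalibrated conversion: each RGB channel is reduced by its
// complementary ink and by black. Inputs are 0..255, 255 = full ink.
function cmykToRgb(c, m, y, k) {
  const channel = (ink) => Math.round(255 * (1 - ink / 255) * (1 - k / 255));
  return [channel(c), channel(m), channel(y)];
}

const white = cmykToRgb(0, 0, 0, 0);   // no ink at all
const black = cmykToRgb(0, 0, 0, 255); // full black ink
const cyan = cmykToRgb(255, 0, 0, 0);  // full cyan ink removes all red
```

Running this per pixel in JavaScript is the kind of work the slide calls "not that performant but good enough".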

Page 16: PDF.JS at SwissJeese 2012

Fonts

• There are lots of different font formats!

• fonts are converted to OpenType

• use CSS for loading: @font-face { font-family: 'font0'; src: url(data:font/opentype;base64, ...); }

• Fonts are sanitized by browser

• Need to rebuild malformed fonts :/
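The CSS loading trick can be sketched as a function that wraps converted font bytes in an `@font-face` rule. `fontFaceRule` is a hypothetical helper name, and the four sample bytes are only the "OTTO" signature that starts a CFF-flavoured OpenType file, not a usable font.

```javascript
// Build an @font-face rule that embeds the font as a base64 data URL.
function fontFaceRule(fontName, openTypeBytes) {
  let binary = "";
  for (const b of openTypeBytes) binary += String.fromCharCode(b);
  const url = "data:font/opentype;base64," + btoa(binary);
  return `@font-face { font-family: '${fontName}'; src: url(${url}); }`;
}

// "OTTO" signature bytes as placeholder font data.
const rule = fontFaceRule("font0", new Uint8Array([0x4f, 0x54, 0x54, 0x4f]));
```

In a page, the rule text would be added to a stylesheet, after which canvas text drawn with the font family 'font0' uses the embedded font. Because the browser sanitizes the data before using it, malformed fonts have to be repaired before this step.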

Page 17: PDF.JS at SwissJeese 2012

“Why are you doing this?”

aka. ∃ C/C++ libraries ⇒ isn’t that faster?

Page 18: PDF.JS at SwissJeese 2012

“Performance is not the only measure”

Page 19: PDF.JS at SwissJeese 2012

1. Security

Page 20: PDF.JS at SwissJeese 2012

Most vulnerable programs

Source: http://www.csis.dk/en/csis/news/3321

Page 21: PDF.JS at SwissJeese 2012

~ 25% crashes in Firefox are Plugin related

Page 22: PDF.JS at SwissJeese 2012

2. Web-Specific Viewer

Page 23: PDF.JS at SwissJeese 2012

3. Drive Innovation

Page 24: PDF.JS at SwissJeese 2012

4. Speed

Page 25: PDF.JS at SwissJeese 2012

4. Speed

• Rendering slower than C/C++

• BUT

• Partial downloading

• Render page in background

• Make the slow parts faster

• Mostly: Good enough

Page 26: PDF.JS at SwissJeese 2012

5. Can do better

Page 27: PDF.JS at SwissJeese 2012

6. Push Web Platform

Page 28: PDF.JS at SwissJeese 2012

B2G aka. Boot2Gecko

Page 33: PDF.JS at SwissJeese 2012

New API: Printing

• Printing very limited on the web right now

• no way to achieve native printing experience

• NEED: New API for printing

• mozPrintCallback

• define canvas content during printing

• send drawing commands directly to printer

Page 34: PDF.JS at SwissJeese 2012

Web Page ➟ Print ➟ Single Pages

Page 35: PDF.JS at SwissJeese 2012

• Find print canvas on page

• Execute printCallback

• All canvas done ➠ print page

Page 1

Page 2

Page 36: PDF.JS at SwissJeese 2012

canvas.mozPrintCallback
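The print flow (find print canvases, execute each callback, print once all are done) can be simulated outside the browser. In the sketch below, the callback shape with `context` and `done()` follows Firefox's experimental `mozPrintCallback` API as far as I know it; treat that shape as an assumption. The canvas stub, `attachPrintCallback` and `simulatePrint` are hypothetical stand-ins for the browser side.

```javascript
// Stub standing in for a <canvas> element; in Firefox this is a real canvas.
function makeCanvasStub(label) {
  return { label, mozPrintCallback: null };
}

// Page side: describe what to draw when the canvas is printed.
function attachPrintCallback(canvas) {
  canvas.mozPrintCallback = (printState) => {
    const ctx = printState.context; // drawing context that targets the printer
    ctx.fillText(`content of ${canvas.label}`, 10, 20);
    printState.done(); // tell the browser this canvas is finished
  };
}

// Browser side (simulated): run every callback, print once all report done.
function simulatePrint(canvases) {
  const drawn = [];
  let pending = canvases.length;
  let printed = false;
  for (const canvas of canvases) {
    const context = { fillText: (text, x, y) => drawn.push({ text, x, y }) };
    const done = () => {
      if (--pending === 0) printed = true;
    };
    canvas.mozPrintCallback({ context, done });
  }
  return { drawn, printed };
}

const canvases = [makeCanvasStub("Page 1"), makeCanvasStub("Page 2")];
canvases.forEach(attachPrintCallback);
const result = simulatePrint(canvases);
```

The explicit `done()` call matters because rendering a page may be asynchronous; the browser waits for every canvas to report completion before it prints.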

Page 37: PDF.JS at SwissJeese 2012

Firefox Integration

Page 38: PDF.JS at SwissJeese 2012

Firefox Integration

• PDF.JS as a bundled Add-on in Firefox Nightly

• Getting in Release Channel is hard

• 400M users have expectations

• more testing coverage

• accessibility

• match UX expectation

• fallback if something is not working

Page 39: PDF.JS at SwissJeese 2012

Firefox Integration

• Try to make it by the Aurora merge (6/5)

• Firefox Specific, BUT

• quality improvements are browser independent

• only small parts Firefox specific

Page 40: PDF.JS at SwissJeese 2012

What’s next

• Fix broken PDFs

• Improve performance

• Improve Text selection

• Text search

• Form support

• Printing support

Page 41: PDF.JS at SwissJeese 2012

Demo

Page 42: PDF.JS at SwissJeese 2012

Contributing

• Lots of areas

• Translation

• Writing Code (embeddable viewer?)

• Testing (Firefox Auto-Update Add-on)

Page 44: PDF.JS at SwissJeese 2012

Q & A