formal, executable semantics of web languages: javascript and php

20
Formal, Executable Semantics of Web Languages: JavaScript and PHP Sergio Ma!eis Imperial College London In collaboration with: J. Mitchell (Stanford), A. Taly (Google), K. Bhargavan, M. Bodin, A. Charugeraud, A. Delignat-Lavaud, A. Schmitt (INRIA), D. Filaretti, P. Gardner, D. Naudziuniene, G. Smith, S. Yuwen (Imperial) FACE Kick-O! Meeting, Verona, March 2014

Upload: face

Post on 04-Jul-2015

608 views

Category:

Education


0 download

TRANSCRIPT

Page 1: Formal, Executable Semantics of Web Languages: JavaScript and PHP

Formal, Executable Semantics

of Web Languages:

JavaScript and PHP

Sergio Ma!eis Imperial College London

In collaboration with: J. Mitchell (Stanford), A. Taly (Google), K. Bhargavan, M. Bodin, A. Charugeraud, A. Delignat-Lavaud, A. Schmitt (INRIA), D. Filaretti, P. Gardner, D. Naudziuniene, G. Smith, S. Yuwen (Imperial) FACE Kick-O! Meeting, Verona, March 2014

Page 2: Formal, Executable Semantics of Web Languages: JavaScript and PHP

Web Application (in)Security

Many components, trust boundaries, possible attacks.

User Browser!

Advertising network! Facebook

user data!

Untrusted application!Java/PHP!

SQL!

JavaScript!

HTTP protocol!

DOM libraries!

Facebook Server!

!""#$%&$#

'()*+(),-.#

"/0#%12.34(1#

'"56#

Page 3: Formal, Executable Semantics of Web Languages: JavaScript and PHP

Language-Based Web Security

•  First, build formal models (this talk) •  Next, analyze/enforce security properties •  Based on:

– JSSec: small-step operational semantics of ES3 – JSCert: Coq semantics and interpreter of ES5 – KPHP: formal executable semantics of PHP in K

Page 4: Formal, Executable Semantics of Web Languages: JavaScript and PHP

JavaScript and PHP

•  Born as small languages – JavaScript: sanitize input of HTML forms – PHP: Personal Home Page Tools for tracking

home page visits

•  Now achieved world domination – All web pages, most servers – Top of Github/StackOveflow popularity

•  Chart from http://langpop.corger.nl

•  Picked up lots of complexity along the way

Page 5: Formal, Executable Semantics of Web Languages: JavaScript and PHP

•  Critical points of failure for web security –  Attacks come from obscure, di"cult corner cases –  Do not leave out tricky or inelegant constructs

•  OK to look at conservative subsets

–  But beware of unsound simplifications

–  .

JavaScript and PHP

Page 6: Formal, Executable Semantics of Web Languages: JavaScript and PHP

Libraries

•  JavaScript, PHP = the brains •  Browser, server libs = the muscle •  We need operational semantics

of the core language – Plus a mechanism to invoke library

functions

•  Formalization of libraries is an independent task – Di!erent goals, techniques – One language, many libraries

Page 7: Formal, Executable Semantics of Web Languages: JavaScript and PHP

Formalization: The Pain

Page 8: Formal, Executable Semantics of Web Languages: JavaScript and PHP

Mechanization: The Gain

Page 9: Formal, Executable Semantics of Web Languages: JavaScript and PHP

Trusting the Formalization

•  JSSec: manual execution (not scalable) – Experiments with various browsers – Driven by corner cases of specification

•  JSCert: Coq to OCAML extraction – JSRef + proof: significant overhead, but trusted – Systematic validation of JSRef using test262

•  KPHP: semantics is directly executable – PHP has no analogous to ES3/5 specification –  (Zend) test-driven semantics development

Page 10: Formal, Executable Semantics of Web Languages: JavaScript and PHP

PHP: What is a Bug?

•  Evaluation order of expressions: LR or RL?

•  PHP bug 61188

Page 11: Formal, Executable Semantics of Web Languages: JavaScript and PHP

PHP: What is a Bug?

•  Formal semantics explains what happens

– Evaluation order is LR – Array accesses are evaluated to values – Variables are evaluated to references – References are resolved lazily

•  Easy fix to expose LR evaluation consistently – BinOp(E1,E2) ! BinOp(R, E2) ! BinOp(V,E2)

Page 12: Formal, Executable Semantics of Web Languages: JavaScript and PHP

Meta-Proofs

•  JSSec: paper proof, labor intensive, error-prone

•  JSCert: Coq proof, even more labor, but trusted

•  Useful for debugging the semantics •  Basis for further proofs

–  Coq proof: 6 months to find the right way, 3 days to do

Page 13: Formal, Executable Semantics of Web Languages: JavaScript and PHP

Secure sandboxing

Untrusted application

User data Facebook server

Examples: online ads, mashups, social networks.

Page 14: Formal, Executable Semantics of Web Languages: JavaScript and PHP

– Prevent access to blacklist: document, window,eval, myAPI,… •  Rewrite e1[e2] with run-time monitor e1[IDX(e2)].

– Key idea: emulate semantics. !

!e1[e2] ! v1[e2] ! v1[v2] ! o[v2] ! o[m] !!

!IDX(e2) = ($=e2,! {toString: function(){$=$String($); ! return ($B[$]?"bad":$);}}) – Theorem: subset prevents access to the identifiers

in a given blacklist. – Practical impact: attacks on

A sandboxed JS subset

7'"689:;<"=5%'"89:>#

Page 15: Formal, Executable Semantics of Web Languages: JavaScript and PHP

Defensive web components

Page 16: Formal, Executable Semantics of Web Languages: JavaScript and PHP

–  A JS subset for security-sensitive components.

–  Conformance verified via static type inference.

–  We built fast defensive libraries •  Cryptography, JSON parsing.

–  Practical impact: attack that steals master password of

DJS: a Defensive JS subset

7?"<@%!8AB>#

Page 17: Formal, Executable Semantics of Web Languages: JavaScript and PHP
Page 18: Formal, Executable Semantics of Web Languages: JavaScript and PHP
Page 19: Formal, Executable Semantics of Web Languages: JavaScript and PHP

Conclusions

•  Toy models of programming languages – Ok for new language features, analysis ideas. –  Inadequate to provide security guarantees

•  Full-blown formal semantics – Basis for trustworthy verification, certification. – Tools and techniques are now mature enough.

Page 20: Formal, Executable Semantics of Web Languages: JavaScript and PHP

References

•  JSSec: –  Semantics: APLAS’08, http://jssec.net/semantics –  Secure subsets: CSF’09, ESORICS’09, OAKLAND’10 –  Program logics: POPL’12 –  Defensive JavaScript: USENIX’13, http://defensivejs.com

•  JSCert: –  POPL’14 –  http://jscert.org, https://github.com/jscert/jscert

•  KPHP: –  ECOOP’14 –  http://phpsemantics.org (to be updated soon)