hacking with hhvm
DESCRIPTION
Given at php[tek] 2014 by Elizabeth Smith as a replacement for Sara
TRANSCRIPT
Hacking with HHVM
Sara [email protected] 5, 2014
Agenda
1 A quick recap
2 PHP is WebScale
3 Sure it’s fast but…
4 Extending HHVM
5 Conclusion / Resources
A quick recap
“What is HHVM?”
HHVM is not a source code transformer
That was HPHPc, it’s dead.
▪ Runs your PHP pages live, just like PHP
▪ Uses a builtin web server or FastCGI – runs anywhere (on 64-bit x86 Linux)
▪ Drop-in replacement for PHP (mostly)
[Diagram: Webserver (apache, nginx, etc…); PHP + APC running cart.php, home.php, login.php, index.php; Database (mysql, postgres, mongo, redis, etc…)]
[Same slide repeated, with HHVM in place of PHP + APC in the diagram]
HHVM supports (most) PHP syntax
Tracking HEAD
▪ Pending: Splat (5.6) (Variadics work already)
And some of its own
▪ Scalar type hint (and much much more)
▪ Async co-routines
▪ Generics
▪ Collections (smart arrays)
▪ XHP (Integrated XHTML)
▪ User Attributes
HHVM is easy to install
If you’re on Ubuntu
▪ deb http://dl.hhvm.com/ubuntu saucy main
▪ apt-get update
▪ apt-get install hhvm (or hhvm-nightly)
▪ Provides one binary covering cli, fcgi server, and libevent server
▪ Coming very soon to a Debian near you!
Or something Debianish…
HHVM is buildable
On other Linux distros (MacOSX in interp mode only)
• http://hhvm.com/repo/wiki
• gcc 4.8 or later (soon to be 4.8 or later)
• Boost 1.49 or later
• Lots of other dependencies….
• git clone git@github.com:facebook/hhvm
• cmake .
• make -j
• hphp/hhvm/hhvm
xkcd.org/303
Running a server
Built-in HTTP server
Server {
  Port = 80
  Type = libevent
  SourceRoot = /var/www
}
Log {
  Level = Error
  UseLogFile = true
  File = /var/log/hhvm-error.log
  Access {
    * {
      File = /var/log/hhvm-access.log
      Format = %h %l %u %t \"%r\" %>s %b
    }
  }
}
VirtualHost {…}
StaticFile {…}
FastCGI
Server {
  Port = 9000
  Type = fastcgi
  SourceRoot = /var/www
}
Log {
  Level = Error
  UseLogFile = true
  File = /var/log/hhvm-error.log
  Access {
    * {
      File = /var/log/hhvm-access.log
      Format = %h %l %u %t \"%r\" %>s %b
    }
  }
}
Requires patched libevent
Running a FastCGI server
nginx
server {
  server_name www.example.com;

  root /var/www;
  index index.php;

  location ~ \.php$ {
    fastcgi_pass unix:/var/run/hhvm.sock;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME /var/www$fastcgi_script_name;
    include fastcgi_params;
  }
}
HHVM
Server {
  FileSocket = /var/run/hhvm.sock
  Type = fastcgi
  SourceRoot = /var/www
}
Log {
  Level = Error
  UseLogFile = true
  File = /var/log/hhvm-error.log
  Access {
    * {
      File = /var/log/hhvm-access.log
      Format = %h %l %u %t \"%r\" %>s %b
    }
  }
}
PHP is webscale
HHVM’s JIT is the secret sauce
HHVM – Bytecode interpreter
• PHP5 style bytecode execution
• APC-like caching of bytecodes
• Perf about on par with PHP
[Flowchart: Modified? Y: Invalidate Cache, Compile to Bytecode, Run Bytecode. N: Run Bytecode.]
HHVM – Native code JIT
• Bytecodes run a few times “cold”
• Variable type inference
• Hotpath detection
• Transform to native code
[Flowchart: Modified? Y: Invalidate Cache, Compile to Bytecode. Have native? Y: Run Native. N: Hot? Y: Compile to Native. N: Run Bytecode.]
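The decision flow on this slide can be sketched as a toy model in plain PHP. This is only an illustration of the "run cold, detect hot, switch to native" idea: the class name ToyRuntime, the HOT_THRESHOLD value, and the string results are all assumptions made for the demo, and HHVM's real JIT works on bytecode regions with type information, not on whole files.

```php
<?php
// Toy model only. Names and the threshold are hypothetical, not HHVM internals.
const HOT_THRESHOLD = 3; // assumed number of "cold" bytecode runs before compiling

final class ToyRuntime {
    private $runs = [];   // unit name => number of bytecode runs so far
    private $native = []; // unit name => true once "compiled to native"

    public function execute(string $unit): string {
        if (isset($this->native[$unit])) {
            return 'native';               // Have native? Y: run native
        }
        $count = $this->runs[$unit] ?? 0;
        if ($count >= HOT_THRESHOLD) {
            $this->native[$unit] = true;   // Hot? Y: compile to native, then run it
            return 'native';
        }
        $this->runs[$unit] = $count + 1;   // Hot? N: run bytecode and count the run
        return 'bytecode';
    }
}

$rt = new ToyRuntime();
$trace = [];
for ($i = 0; $i < 5; $i++) {
    $trace[] = $rt->execute('home.php');
}
echo implode(' ', $trace), "\n"; // bytecode bytecode bytecode native native
```

The first three requests run as bytecode; from the fourth on, the unit is served from the "native" cache.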
HHVM – Repo Authoritative Mode
• “Production Mode”
• Improved offline pre-analysis
• Assumes no changes
• Another 10% or so perf gain
[Flowchart: PreCompile Bytecode. Have native? Y: Run Native. N: Hot? Y: Compile to Native. N: Run Bytecode.]
But don’t take my word for it…
Magento (Daniel Sloof): https://t.co/UB1aOzJ73c
Symfony (Christian Stocker)
Requests per Second: http://bit.ly/1fCRw99
Symfony (Christian Stocker)
Page load Time: http://bit.ly/1fCRw99
I heard PHP is getting a JIT…
▪ phpng
▪ Translates Zend bytecode to LLVM intermediate
▪ Compiles to machine code which gets run directly
▪ Yes, it’s faster than PHP 5.6, but not faster than HHVM
▪ LLVM is a general purpose JIT. HHVM was designed for PHP’s “quirks”
▪ HHVM’s variable types were laid out with JITting in mind, PHP is stuck with the zval
▪ ZE must use a caller frame. HHVM issues direct CPU fcalls
▪ HHVM has access to Hack type hinting/inference
Sure it’s fast
But what else can it do?
HACK (yes, it’s a horrible name…)
Type Hinting gone mad
▪ Scalars
▪ bool, int, float, num, string
▪ Return typehints
▪ Typed properties
▪ Constructor arg promotion
▪ Specialization
▪ array<int>
▪ array<string,array<string,stdClass>>
Static Analysis
▪ Analyze entire code tree without running
▪ Report type errors like a strictly typed language
▪ Fallback on loose typing when needed (because that’s what PHP is good at)
HACK
<?hh

class Foo {
  private int $num = 123;

  public function add(int $delta): Foo {
    $this->num += $delta;
    return $this;
  }

  public function get(): int {
    return $this->num;
  }

  public function __construct(int $num): void {
    $this->num = $num;
  }
}

$f = new Foo(123);
$f->add(456);
$f->add("banana"); // type error: string passed where int is expected

Basic Hack
▪ Static Analyzer traces value through whole program
▪ Looks for mismatches and reports errors
▪ “Foo::add() expected int” “Foo::add() received string”
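For contrast, here is a rough plain-PHP analogue (PHP 7.0 or later), offered as a sketch and not as Hack. Plain PHP only notices the mismatch when the offending call actually executes, which is the gap the static analyzer described above is meant to close. The class name Counter is made up for this example.

```php
<?php
// Plain PHP (7.0+), not Hack. The class name is hypothetical.
final class Counter {
    private $num;

    public function __construct(int $num) {
        $this->num = $num;
    }

    public function add(int $delta): Counter {
        $this->num += $delta;
        return $this;
    }

    public function get(): int {
        return $this->num;
    }
}

$c = new Counter(123);
$c->add(456);

try {
    $c->add("banana"); // non-numeric string cannot be coerced to int
} catch (TypeError $e) {
    echo "rejected at run time: ", get_class($e), "\n";
}

echo $c->get(), "\n"; // 579
```

The error surfaces only on the code path that runs, so a rarely used branch can hide a mismatch until production.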
HACK – Constructor arg promotion
<?hh

class Foo {
  public function add(int $delta): Foo {
    $this->num += $delta;
    return $this;
  }

  public function get(): int {
    return $this->num;
  }

  public function __construct(private int $num): void { }
}

$f = new Foo(123);
echo $f->add(456)->get();

Avoid repeated patterns
▪ public/protected/private as __construct arg modifiers
▪ Declares the property and initializes from value
▪ Less boilerplate, more coding
HACK – Generics
<?hh

class Foo<T> {
  public function add(T $delta): Foo {
    $this->num += $delta;
    return $this;
  }

  public function get(): T {
    return $this->num;
  }

  public function __construct(private T $num): void { }
}

$i = new Foo<int>(123);
echo $i->add(456)->get();

$f = new Foo<float>(3.14);
echo $f->add(2.17)->get();

Specialize common code
▪ T is replaced w/ specialization type at instantiation
▪ Typechecker propagates replacement through generic
▪ Less boilerplate, more coding
HACK – Collections
<?hh

function foo(Vector<int> $nums,
             Set<string> $names,
             Map<int,string> $numNameMap,
             FrozenVector<int> $nums2): bool {
  foreach ($nums as $num) {
    $mappedName = $numNameMap[$num];
    if ($names->contains($mappedName)) {
      return true;
    }
  }
  if ($nums2->count() == 0) return true;
  return false;
}

Specialized array objects
▪ Vector, Set, Map, StableMap
▪ Frozen* / Mutable* (default)
▪ Support ArrayAccess
▪ Extensive library of methods
Async Functions
<?php

// NOT ACTUAL SYNTAX
// Way over-simplified to fit on a slide

async function getPage($url) {
  $fp = fopen($url, 'r');
  await $fp;
  return stream_get_contents($fp);
}

$pages = await [
  getPage('http://php.net'),
  getPage('http://example.com'),
  getPage('http://hhvm.com'),
];
Cooperative Multitasking
▪ Parallelizing made easy
▪ Allow multiple functions to run “simultaneously”
▪ While one is blocking, the other executes
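The cooperative idea can be approximated in plain PHP (7.0 or later) with generators. This is a sketch of the scheduling concept only, not HHVM's async implementation: fetchPage() and runAll() are hypothetical helpers, no network I/O happens, and each yield simply stands in for a point where a real task would block.

```php
<?php
// Sketch with generators; helper names are hypothetical and nothing is fetched.
function fetchPage(string $url, int $steps): Generator {
    for ($i = 1; $i <= $steps; $i++) {
        // A real task would block on I/O here; yielding hands control back.
        yield "$url: waiting ($i/$steps)";
    }
    return "contents of $url";
}

// Round-robin scheduler: give each unfinished task one step per pass.
function runAll(array $tasks): array {
    $results = [];
    $log = [];
    while ($tasks) {
        foreach ($tasks as $name => $task) {
            if ($task->valid()) {
                $log[] = $task->current(); // record where the task paused
                $task->next();             // resume it until its next yield
            } else {
                $results[$name] = $task->getReturn();
                unset($tasks[$name]);
            }
        }
    }
    return [$results, $log];
}

list($results, $log) = runAll([
    'php'  => fetchPage('http://php.net', 2),
    'hhvm' => fetchPage('http://hhvm.com', 1),
]);

print_r($log);     // steps from both tasks, interleaved
print_r($results); // one result per task name
```

The log shows the two tasks taking turns: while one is "waiting", the other gets to run.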
User Attributes
<?php

<<Author("sgolemon"),Clowny>>
function life() {
  return 42;
}

$rf = new ReflectionFunction('life');
print_r($rf->getAttributes());
print_r($rf->getAttribute('Author'));

Array
(
  [Author] => Array
    (
      [0] => sgolemon
    )
  [Clowny] => Array
    (
    )
)
Array
(
  [0] => sgolemon
)
User-defined metadata
▪ Arbitrary labels surfaced via reflection
▪ Basically what it says on the tin
XHP – XHtml for PHP
<?php
if (validate($_GET['id'], $_GET['pwd'])) {
  loginAndRedirectToHome();
  exit;
}
?>
<html>
 <head><title>Login</title></head>
 <body>
  <form>
   ID: <input type=text name="username" value="<?= $_GET['id'] ?>" />
   Pass: <input type=password name="password" value="<?= $_GET['pwd'] ?>" />
  </form>
 </body>
</html>

Easy noob mistake
▪ http://example.com/?id="><script>alert("Gotcha!")</script>
▪ Trivial vector for stealing user’s cookies (and login credentials)
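The conventional manual fix in plain PHP is to escape every value at the point of output, for example with htmlspecialchars(). A minimal sketch follows; the helper name attr() is made up for this example, and the payload is the one from the slide.

```php
<?php
// attr() is a hypothetical helper; htmlspecialchars() is the standard PHP function.
function attr(string $value): string {
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// The payload from the slide, as it would arrive in $_GET['id'].
$id = '"><script>alert("Gotcha!")</script>';

// The quote and angle brackets become entities, so the attribute cannot be closed early.
echo '<input type="text" name="username" value="' . attr($id) . '" />', "\n";
```

This works, but only if nobody ever forgets a call, which is the weakness that makes a markup-aware syntax attractive.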
XHP – XHtml for PHP
<?php
if (validate($_GET['id'], $_GET['pwd'])) {
  loginAndRedirectToHome();
  exit;
}

include 'xhp/init.php';

$form =
  <form>
    ID: <input type="text" name="username" value={$_GET['id']} />
    Pass: <input type="password" name="password" value={$_GET['pwd']} />
  </form>;

echo
  <html>
    <head><title>Login</title></head>
    <body>{$form}</body>
  </html>;

Markup as 1st class syntax
▪ Parser can tell what’s userdata, and what isn’t
▪ Catch most XSS automatically
▪ Formalizes component modularity
XHP – XHtml for PHP
<?php
class :mysite:footer extends :xhp:html-element {
  attribute enum {'en','es'} lang = 'en';
  attribute string prefix;

  public function stringify() {
    $home = ($this->lang == 'es') ? 'Casa' : 'Home';
    $footer =
      <div class="footer">
        <a href={$this->prefix . '/'}>{$home}</a>
        <a href={$this->prefix . '/contact.php'}>Contact Us</a>
        etc…
      </div>;
    return (string)$footer;
  }
}

echo
  <html>
    <body>
      Blah Blah
      <mysite:footer lang='es' />
    </body>
  </html>;

Markup as 1st class syntax
▪ Custom components
▪ Compose from other tags or generate custom xhtml
XHP – XHtml for PHP
<?php

class :blink extends :xhp:html-element {
  category %flow, %phrase;
  children (pcdata | %phrase);
  protected $tagName = 'blink';
}

echo
  <html>
    <body>
      <blink>UNDER CONSTRUCTION</blink>
      <mysite:footer lang='es' />
    </body>
  </html>;
Markup as 1st class syntax
▪ Can also contain other tags
▪ Default :xhp:html-element behavior makes simple tags easy
▪ Replace :blink’s stringify to use javascript since browsers disable
HPHPd – HPHP Debugger
• Interactive shell
• GDB-like debugging
• Standalone or w/ server
• Breakpoints
• Watches
• Macros
Extending HHVM
You *can* do that in PHP
Writing a PHP Extension…
zend_bool array_column_param_helper(zval **param, const char *name TSRMLS_DC) {
  switch (Z_TYPE_PP(param)) {
    case IS_DOUBLE:
      convert_to_long_ex(param);
      /* fall through */
    case IS_LONG:
      return 1;
    case IS_OBJECT:
      convert_to_string_ex(param);
      /* fall through */
    case IS_STRING:
      return 1;
    default:
      php_error_docref(NULL TSRMLS_CC, E_WARNING,
                       "The %s key should be either a string or an integer", name);
      return 0;
  }
}

PHP_FUNCTION(array_column) {
  zval **zcolumn = NULL, **zkey = NULL, **data;
  HashTable *arr_hash;
  HashPosition pointer;

  if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "hZ!|Z!",
                            &arr_hash, &zcolumn, &zkey) == FAILURE) {
    return;
  }
  if ((zcolumn && !array_column_param_helper(zcolumn, "column" TSRMLS_CC)) ||
      (zkey && !array_column_param_helper(zkey, "index" TSRMLS_CC))) {
    RETURN_FALSE;
  }
  array_init(return_value);
  for (zend_hash_internal_pointer_reset_ex(arr_hash, &pointer);
       zend_hash_get_current_data_ex(arr_hash, (void**)&data, &pointer) == SUCCESS;
       zend_hash_move_forward_ex(arr_hash, &pointer)) {
    zval **zcolval, **zkeyval = NULL;
    HashTable *ht;
    if (Z_TYPE_PP(data) != IS_ARRAY) {
      continue;
    }
    ht = Z_ARRVAL_PP(data);
    if (!zcolumn) {
      zcolval = data;
    } else if ((Z_TYPE_PP(zcolumn) == IS_STRING) &&
               (zend_hash_find(ht, Z_STRVAL_PP(zcolumn), Z_STRLEN_PP(zcolumn) + 1,
                               (void**)&zcolval) == FAILURE)) {
      continue;
    } else if ((Z_TYPE_PP(zcolumn) == IS_LONG) &&
               (zend_hash_index_find(ht, Z_LVAL_PP(zcolumn),
                                     (void**)&zcolval) == FAILURE)) {
      continue;
    }
    if (zkey && (Z_TYPE_PP(zkey) == IS_STRING)) {
      zend_hash_find(ht, Z_STRVAL_PP(zkey), Z_STRLEN_PP(zkey) + 1, (void**)&zkeyval);
    } else if (zkey && (Z_TYPE_PP(zkey) == IS_LONG)) {
      zend_hash_index_find(ht, Z_LVAL_PP(zkey), (void**)&zkeyval);
    }
    Z_ADDREF_PP(zcolval);
    if (zkeyval && Z_TYPE_PP(zkeyval) == IS_STRING) {
      add_assoc_zval(return_value, Z_STRVAL_PP(zkeyval), *zcolval);
    } else if (zkeyval && Z_TYPE_PP(zkeyval) == IS_LONG) {
      add_index_zval(return_value, Z_LVAL_PP(zkeyval), *zcolval);
    } else if (zkeyval && Z_TYPE_PP(zkeyval) == IS_OBJECT) {
      SEPARATE_ZVAL(zkeyval);
      convert_to_string(*zkeyval);
      add_assoc_zval(return_value, Z_STRVAL_PP(zkeyval), *zcolval);
    } else {
      add_next_index_zval(return_value, *zcolval);
    }
  }
}
The same function in HHVM
<?php

function array_column(array $arr, $col, $key = null) {
  $ret = [];
  // Iterate by value only: binding "$key => $val" here would overwrite the $key parameter
  foreach ($arr as $val) {
    if (!is_array($val)) continue;
    $cval = ($col === null) ? $val : $val[$col];
    if ($key === null || !isset($val[$key])) {
      $ret[] = $cval;
    } else {
      $ret[$val[$key]] = $cval;
    }
  }
  return $ret;
}
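A short usage sketch of the userland approach, runnable on stock PHP 7.0 or later. The function is renamed userland_array_column() (a hypothetical name) so it does not clash with the built-in array_column(), and it adds one assumption beyond the slide: rows that lack the requested column are skipped, as the C version does.

```php
<?php
// Hypothetical name, chosen to avoid redeclaring the built-in array_column().
function userland_array_column(array $arr, $col, $key = null): array {
    $ret = [];
    foreach ($arr as $row) {
        if (!is_array($row)) continue;                      // ignore non-array rows
        if ($col !== null && !array_key_exists($col, $row)) {
            continue;                                       // assumption: skip rows without the column
        }
        $cval = ($col === null) ? $row : $row[$col];
        if ($key === null || !isset($row[$key])) {
            $ret[] = $cval;                                 // no usable index: append
        } else {
            $ret[$row[$key]] = $cval;                       // index by the key column
        }
    }
    return $ret;
}

$rows = [
    ['id' => 10, 'name' => 'cart'],
    ['id' => 20, 'name' => 'home'],
    'not an array',
    ['name' => 'login'],
];

print_r(userland_array_column($rows, 'name'));       // cart, home, login
print_r(userland_array_column($rows, 'name', 'id')); // 10 => cart, 20 => home, 21 => login
```

The row without an id is appended, so it lands at the next free integer index after 20.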
Crossing the PHP->C++ barrier
Array HHVM_FUNCTION(array_column, CArrRef arr, CVarRef col, CVarRef key) {
  Array ret;
  for (auto &pair : arr) {
    // Only the row value is needed; a local named "key" would shadow the parameter
    Variant val = pair.second;
    if (!val.isArray()) continue;
    Array aval = val.toArray();
    Variant cval = col.isNull() ? aval : aval[col];
    if (key.isNull() || !aval.exists(key)) {
      ret.append(cval);
    } else {
      ret[aval[key]] = cval;
    }
  }
  return ret;
}
Blurring the PHP/C++ line
class MyClass {
  public static function escapeString(string $str): string {
    return str_replace("'", "\\'", $str);
  }

  <<__Native>>
  public function query(string $sql): array;
}

Array HHVM_METHOD(MyClass, query, const String& sql) {
  // Call C/C++ library function and return result…
  // Called object available as: CObjRef this_
  // Static calls come through: Class *self_
}
Questions ?
Resources
• http://hhvm.com/repo
• http://hhvm.com/blog
• http://hhvm.com/twitter
• http://docs.hhvm.com
• hphp/doc in git repository for Options and Technical… Stuff
• Freenode / #hhvm
• http://hhvm.com/fb/page
• http://hhvm.com/fb/general
• @HipHopVM
• @HackLang