IBM Research © 2009 IBM Corporation Highly Scalable Web Applications with Zero-Copy Data Transfer Toyotaro Suzumura , Michiaki Tatsubori, Scott Trent, Akihiko Tozawa, and Tamiya Onodera Tokyo Research Laboratory IBM Research 2009/4/22 17:00-17:30 Web-Eng 3 - Web Architecture Aspect


IBM Research

© 2009 IBM Corporation

Highly Scalable Web Applications with Zero-Copy Data Transfer

Toyotaro Suzumura, Michiaki Tatsubori, Scott Trent, Akihiko Tozawa, and Tamiya Onodera

Tokyo Research Laboratory

IBM Research

2009/4/22 17:00-17:30 Web-Eng 3 - Web Architecture Aspect


Executive Summary

Scale-up approach still matters for ultimate scalability
– The performance of server-side applications is becoming increasingly important as more applications exploit the Web application model in the Web 2.0 era

Our Main Contribution
– A novel approach to boost Web application performance with what we call “Zero Copy Data Transfer” that reduces redundant memory copying and context switch overhead between user space and kernel space

Achievements
– 126% performance improvement with micro-benchmarks
– 31% performance improvement for standard Web benchmark, SPECweb2005


Outline of this talk

1. Motivation

2. Our Solution: Zero Copy Data Transfer

3. Performance Evaluation

4. Related Work, Future Direction and Conclusions


Background : Web Server Performance Does Matter

In the Web 2.0 era, software applications are increasingly being developed on the Web application model over HTTP.

This rapidly growing use is dramatically increasing the performance requirements for Web application servers.

Giant internet companies such as Microsoft, Google, Amazon.com, and Yahoo! have deployed tens of thousands of physical hosts to maintain QoS and scalability.

Total Sites Across All Domains August 1995 - March 2009 (Netcraft)

http://news.netcraft.com/archives/web_server_survey.html


Optimizing Web Server Performance

Static Web Server (only serving static files)
– In-kernel web servers [Almol, TOCS ‘04] [King, USENIX ‘01]
– Zero-copy Approach [Lighttpd 1.5 beta, ’08]

Dynamic Web Server (including business logic) = HTTP Server + Connector (SAPI) + Script Execution Runtime
– Scale-Out Approach with Load Balancing
– Optimizing Script Execution Runtime with Just-in-time compiler [Tozawa, PHP ’08] [YARV]
– Dynamic Web Server Comparison: PHP vs. JSP [Trent, Middleware 2008]
• Lighttpd / FastCGI / PHP was the best!

How can we make web sites more scalable to support users?

We want more.

Any other optimization?


Profiling Dynamic Web Server with SPECweb2005

Dynamic web server consists of Lighttpd Web Server, PHP Runtime (P9), and FastCGI as SAPI

The profiling result shows that significant time is spent on memory copying both in Web server and PHP runtime

[Figure: CPU Usage in PHP (P9) (SPECweb Banking). Pie chart with segments runtime library, libc-2.6.so, compiled code, extension library, and other; values shown: 37%, 21%, 5%, 13%, 24%]

Samples  CPU Usage (%)  Library              Process      Symbol
53999    9.4771         libc-2.6.so          lighttpd     memcpy              (memory copy)
39009    6.8463         libc-2.6.so          phoebe-fcgi  memcpy              (memory copy)
34101    5.9849         e1000                lighttpd     (no symbols)
23828    4.1819         libcrypto.so.0.9.8b  lighttpd     bn_mul_add_words
20247    3.5534         libp9rtsvc24.so      phoebe-fcgi  storeGenericAux
19645    3.4478         libcrypto.so.0.9.8b  lighttpd     bn_sqr_comba8
12944    2.2717         libcrypto.so.0.9.8b  lighttpd     BN_from_montgomery
11547    2.0266         libc-2.6.so          lighttpd     _int_malloc
10737    1.8844         libp9rtsvc24.so      phoebe-fcgi  loadIndex
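To make the memory-copy share concrete, the short Python sketch below sums the memcpy rows of the profile listing above. It is only an illustration: the row tuples are copied from the listing, the function names are hypothetical, and the totals cover only the rows shown on the slide, not the full profile.

```python
# Rows copied from the profile listing: (samples, percent, library, process, symbol).
PROFILE_ROWS = [
    (53999, 9.4771, "libc-2.6.so", "lighttpd", "memcpy"),
    (39009, 6.8463, "libc-2.6.so", "phoebe-fcgi", "memcpy"),
    (34101, 5.9849, "e1000", "lighttpd", "(no symbols)"),
    (23828, 4.1819, "libcrypto.so.0.9.8b", "lighttpd", "bn_mul_add_words"),
    (20247, 3.5534, "libp9rtsvc24.so", "phoebe-fcgi", "storeGenericAux"),
    (19645, 3.4478, "libcrypto.so.0.9.8b", "lighttpd", "bn_sqr_comba8"),
    (12944, 2.2717, "libcrypto.so.0.9.8b", "lighttpd", "BN_from_montgomery"),
    (11547, 2.0266, "libc-2.6.so", "lighttpd", "_int_malloc"),
    (10737, 1.8844, "libp9rtsvc24.so", "phoebe-fcgi", "loadIndex"),
]


def share_by_symbol(rows, symbol):
    """Sum the CPU percentage of every listed row with the given symbol."""
    return round(sum(pct for _, pct, _, _, sym in rows if sym == symbol), 4)


def share_by_process(rows, symbol):
    """Break the share of one symbol down per process."""
    totals = {}
    for _, pct, _, process, sym in rows:
        if sym == symbol:
            totals[process] = round(totals.get(process, 0.0) + pct, 4)
    return totals


if __name__ == "__main__":
    print("memcpy total (%):", share_by_symbol(PROFILE_ROWS, "memcpy"))
    print("memcpy per process (%):", share_by_process(PROFILE_ROWS, "memcpy"))
```

Among the listed rows, memcpy is the top symbol in both the web server process (lighttpd) and the PHP runtime process (phoebe-fcgi).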


Sniffing FastCGI Packets in SPECweb Banking

\0x01\0x01\0\0x01\0\0x08\0\0\0\0x01\0\0\0\0\0\0\0x01\0x04\0\0x01\0x05T\0\0\0x0F\0x0FSERVER_SOFTWARElighttpd/1.4.18\0x0B\0x19SERVER_NAMEmichis3.trl.ibm.com:8099\0x11\0x07GATEWAY_INTERFACECGI/1.1\0x0B\0x04SERVER_PORT8099\0x0B\0x0CSERVER_ADDR9.116.14.105\0x0B\0x04REMOTE_PORT3456\0x0B\0x0BREMOTE_ADDR9.116.14.91\0x0B\0x19SCRIPT_NAME/bank/account_summary.php\0x09\0PATH_INFO\0x0FFSCRIPT_FILENAME/home/suzumura/software/lighttpd/var/www/html/bank/account_summary.php\r.DOCUMENT_ROOT/home/suzumura/software/lighttpd/var/www/html/\0x0B\0x19REQUEST_URI/bank/account_summary.php\0x0C\0QUERY_STRING\0x0E\0x03REQUEST_METHODGET\0x0F\0x03REDIRECT_STATUS200\0x0F\0x08SERVER_PROTOCOLHTTP/1.1\0x09\0x19HTTP_HOSTmich-is3.trl.ibm.com:8099\0x0FYHTTP_USER_AGENTMozilla/5.0 (Windows; U; Windows NT 5.1; ja; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11\0x0BcHTTP_ACCEPTtext/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5\0x14\0x17HTTP_ACCEPT_LANGUAGEja,en-us;q=0.7,en;q=0.3\0x14\0x0CHTTP_ACCEPT_ENCODINGgzip,deflate\0x13\0x1DHTTP_ACCEPT_CHARSETShift_JIS,utf-8;q=0.7,*;q=0.7\0x0F\0x03HTTP_KEEP_ALIVE300\0x0F\nHTTP_CONNECTIONkeep-alive\0x0CFHTTP_REFERERhttp://mich-is3.trl.ibm.com:8099/bank/check_detail_html.php?check_no=1\0x0B\0x80\0\0x01VHTTP_COOKIECoreID6=46881926841911901663357; w3ibmProfile=2005081020414106874663086|gASP|760|488|null; sauidp=U162807841190166314000; s_nr=1190095558906; ibmSurvey=1211076852290; CoreID6=46881926841911901663357; w3ibmProfile=2005081020414106874663086|gASP|760|488|null; sauidp=U162807841190166314000; s_nr=1190095558906; ibmSurvey=1211076852290;userid=1\0x01\0x04\0\0x01\0\0\0\0\0x01\0x05\0\0x01\0\0\0\0

\0x01\0x06\0\0x01\0x1F\0xF8\0\0Set-Cookie: CoreID6=46881926841911901663357;p

ath=/\r\nSet-Cookie: w3ibmProfile=2005081020414106874663086|gASP|760|488|null;pa

th=/\r\nSet-Cookie: sauidp=U162807841190166314000;path=/\r\nSet-Cookie: s_nr=119

0095558906;path=/\r\nSet-Cookie: ibmSurvey=1211076852290;path=/\r\nSet-Cookie: u

serid=1;path=/\r\nContent-Type: text/html\r\n\r\n<!DOCTYPE html PUBLIC "-//W3C//

DTD HTML 4.01 Transitional//EN" \r\n"http://www.w3.org/TR/html4/loose.dtd">\r\n<

html>\r\n <head>\r\n <title>SPECweb2005: Account Summary</title>\r\n </head

>\r\n <body bgcolor="white">\r\n <!-- SPECweb2005 Dynamic Data Area -->\r\n <

table summary="SPECweb2005_User_Id">\r\n <tr><th>User ID</th></tr>\r\n <t

r><td>1</td></tr> \r\n </table>\r\n <table summary="SPECweb2005_Acct_Summary"

cellpadding=3 border=1>\r\n <tr>\r\n <th>Account</th>\r\n <th>Type<

/th>\r\n <th>Current Balance</th>\r\n <th>Total Deposits</th>\r\n

<th>Average Deposit</th>\r\n <th>Total Withdraws</th>\r\n <th>Average

Withdraws</th>\r\n </tr>\r\n <tr>\r\n <td>0000000251</td>\r\n

<td> Saving\r\n </td>\r\n <td>9373.01</td>\r\n

<td>73.01</td>\r\n <td>73.01</td>\r\n <td>67.01</td>\r\n <td>6

7.01</td>\r\n\0x09</tr>\r\n <tr>\r\n <td>0000000252</td>\r\n

<td> Other\r\n </td>\r\n <td>9373.01</td>\r\n <td

>173.01</td>\r\n <td>73.01</td>\r\n <td>117.01</td>\r\n <td>67.01

</td>\r\n\0x09</tr>\r\n </table> \r\n <!-- SPECweb2005 Displayable Page

Title -->\r\n <table>\r\n <tr>\r\n <td><b><font color="#9400d3">SPECwe

b2005: Account Summary</font></b></td>\r\n </tr>\r\n </table>\r\n <!-- SPEC

web2005 User Action Area -->\r\n \0x09<ul>\r\n\0x09 <li><a href="account_summa

ry.php">Account Summary</a>\r\n\0x09 <li><a href="check_detail_input.php">Check

Detail</a>\r\n <li><a href="profile.php">Change Profile</a>\r\n <li><

a href="transfer.php">Transfer Money</a>\r\n\0x09 <li>Bill Pay\r\n <ul>\

r\n <li><a href="add_payee.php">Add Payee</a>\r\n <li><a href=

"bill_pay.php">Quick Pay</a>\r\n <li><a href="bill_pay_status_input.php

">Check Status</a>\r\n </ul>\r\n\0x09 <li><a href="order_check.php">Orde

r Check</a>\r\n <li><a href="logout.php">Logout</a>\r\n\0x09</ul>\r <!-- S

PECweb2005 Image References -->\r\n <!-- SPECweb2005 Embedded Text -->\r\n<pre>

account_summary.php in SPECweb Banking, exchanged between the HTTP server and the PHP Runtime over FastCGI:

Request: 1381 bytes

Response: 19352 bytes (19KB)
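The byte dump above can be read with the FastCGI record layout: an 8-byte header (version, type, request id, content length, padding length, reserved) followed by the content, where PARAMS content is a sequence of length-prefixed name/value pairs. The following Python sketch decodes that layout. It is a minimal illustration and not the code used in the talk; the function names are hypothetical and only a few record types are named.

```python
import struct

# FastCGI record header: version, type, requestId, contentLength, paddingLength, reserved.
FCGI_HEADER = struct.Struct(">BBHHBx")
FCGI_TYPES = {1: "BEGIN_REQUEST", 3: "END_REQUEST", 4: "PARAMS", 5: "STDIN", 6: "STDOUT"}


def parse_header(data, pos=0):
    """Decode one 8-byte record header starting at pos."""
    version, rtype, request_id, content_length, padding = FCGI_HEADER.unpack_from(data, pos)
    return {
        "version": version,
        "type": FCGI_TYPES.get(rtype, rtype),
        "request_id": request_id,
        "content_length": content_length,
        "padding_length": padding,
    }


def read_length(data, pos):
    """A length is 1 byte, or 4 bytes with the high bit set for values >= 128."""
    if data[pos] < 0x80:
        return data[pos], pos + 1
    (value,) = struct.unpack_from(">I", data, pos)
    return value & 0x7FFFFFFF, pos + 4


def parse_params(body):
    """Decode the name/value pairs carried in a PARAMS record body."""
    params = {}
    pos = 0
    while pos < len(body):
        name_len, pos = read_length(body, pos)
        value_len, pos = read_length(body, pos)
        name = body[pos:pos + name_len].decode("latin-1")
        pos += name_len
        params[name] = body[pos:pos + value_len].decode("latin-1")
        pos += value_len
    return params


if __name__ == "__main__":
    # Second record header of the request dump: \x01 \x04 \0 \x01 \x05 'T' \0 \0
    print(parse_header(bytes([0x01, 0x04, 0x00, 0x01, 0x05, 0x54, 0x00, 0x00])))
    print(parse_params(b"\x0f\x0fSERVER_SOFTWARElighttpd/1.4.18\x0b\x04SERVER_PORT8099"))
```

Every environment variable and every byte of the generated page crosses the FastCGI socket in this framing, which is why large response bodies translate directly into memory copies on both sides.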


Simplified Dynamic Web Page

– Header Part
– Relatively Static Part (Cached File)
– Dynamic Part
– Semi-Static Part (Cached File, Long Characters, etc)
– Footer Part

Data sources shown: DB, File System


Interaction between HTTP Server and PHP Runtime

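PHP App:

header_processing();
echo file_get_contents('fileA');
footer_processing();

[Figure: conventional data path. Components shown: HTTP Server and PHP Runtime in User Space, each with a Buffer; Kernel Space with the socket, a Buffer, and the File in the File System. The File Content is passed from the PHP Runtime to the HTTP Server over FastCGI.]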


Outline of this talk

1. Background & Motivation

2. Our Solution: Zero Copy Data Transfer

3. Performance Evaluation

4. Related Work, Future Direction and Conclusions


Our Solution: Zero Copy Data Transfer

Reduces inter-process communication overhead between web server and PHP runtime

Normally, PHP reads the file contents, converts them to a string buffer, and forwards the buffer as a FastCGI packet.

Instead, do the following:
– The PHP runtime passes the “file name” within a FastCGI packet.
– The web server then uses the information to invoke a zero-copy system call such as sendfile, which is supported by major operating systems.

Note that PHP developers need not modify their scripts to use our optimization
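The difference between the two paths can be sketched in Python. This is an illustrative sketch only, not the Lighttpd or P9 implementation: `send_by_copy` and `send_zero_copy` are hypothetical helper names, and the zero-copy variant relies on the standard-library `socket.sendfile()` method, which uses `os.sendfile()` where the platform provides it and otherwise falls back to ordinary `send()` calls.

```python
import os
import socket
import tempfile


def send_by_copy(sock, path, chunk_size=65536):
    """Conventional path: read() copies file data into user space,
    then sendall() copies it back into the kernel socket buffer."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return sent
            sock.sendall(chunk)
            sent += len(chunk)


def send_zero_copy(sock, path):
    """Zero-copy path: the kernel moves file data to the socket directly
    when os.sendfile() is available; no user-space string is built."""
    with open(path, "rb") as f:
        return sock.sendfile(f)


def recv_exact(sock, size):
    """Read exactly size bytes from the peer (helper for the demo)."""
    data = bytearray()
    while len(data) < size:
        chunk = sock.recv(size - len(data))
        if not chunk:
            break
        data += chunk
    return bytes(data)


if __name__ == "__main__":
    payload = b"x" * 4096  # small enough to fit in the socket buffer
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    server_side, client_side = socket.socketpair()
    try:
        for sender in (send_by_copy, send_zero_copy):
            count = sender(server_side, tmp.name)
            received = recv_exact(client_side, count)
            print(sender.__name__, count, received == payload)
    finally:
        server_side.close()
        client_side.close()
        os.unlink(tmp.name)
```

Both functions deliver the same bytes; the point of the proposal is that only the second one avoids building the file content as a string in the runtime and the web server.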


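Proposed Approach

PHP App:

header_processing();
echo file_get_contents('fileA');
footer_processing();

[Figure: proposed data path. Components shown: HTTP Server and PHP Runtime in User Space; Kernel Space with the socket, Buffers, and the File in the File System. The PHP Runtime sends the File URI with X-ZeroCopy over FastCGI; X-ZeroCopy Handling in the HTTP Server invokes sendfile.]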


File Processing : FTCS (File-Type Character String) Object

A new type of character string object that holds only the file name (URI) of a file, instead of reading the entire content of the file as an ordinary character string object does

Ordinary Character String Object: the PHP Runtime holds its own copy of the content of File(fileA), e.g. abcdefghijklmnopqrstuvwxyz

FTCS Object: the PHP Runtime holds only "URI: fileA"; the content abcdefghijklmnopqrstuvwxyz stays in File(fileA)
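As a rough illustration of the idea, the Python sketch below models a file-backed string that stores only the path and reads the file only when some operation really needs the characters. The class name `FileBackedString`, its methods, and the `file_get_contents` helper are hypothetical stand-ins chosen for this sketch; they are not the P9 runtime's actual data structure.

```python
import os
import tempfile


class FileBackedString:
    """Hypothetical stand-in for an FTCS object: keeps the file name, not the content."""

    def __init__(self, path):
        self.path = path
        self._content = None  # filled in only if the content is really needed

    @property
    def is_materialized(self):
        return self._content is not None

    def materialize(self):
        """Read the file on first use and cache the result."""
        if self._content is None:
            with open(self.path, "r", encoding="utf-8") as f:
                self._content = f.read()
        return self._content

    def strip(self):
        # An operation on the characters forces the file to be loaded.
        return self.materialize().strip()

    def __str__(self):
        return self.materialize()


def file_get_contents(path):
    """Return a file-backed string without touching the file content."""
    return FileBackedString(path)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile("w", suffix=".html", delete=False) as tmp:
        tmp.write("  abcdefghijklmnopqrstuvwxyz  ")
    value = file_get_contents(tmp.name)
    print(value.is_materialized)  # nothing has been read yet
    print(value.strip())          # the content is loaded here
    print(value.is_materialized)
    os.unlink(tmp.name)
```

As long as the object is only passed along to output, the runtime can hand the path to the web server; once a string operation touches it, it behaves like an ordinary string.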


X-ZeroCopy: Enhancing FastCGI Protocol

X-ZeroCopy: New HTTP Header (HTTP Extension)
– File name and location are included in the body portion of a FastCGI message
– The header is recursively defined to allow the transmission of multiple files

Our PHP runtime automatically generates an X-ZeroCopy header and body content in a transparent manner from unmodified PHP scripts

X-ZeroCopy = "X-ZeroCopy" ":" #( offset "/" length )


Example

X-ZeroCopy: 5/11

hello/tmp/A.htmlworld

<?php
echo "hello";
$a = file_get_contents('/tmp/A.html');
echo $a;
echo "world";
?>

FTCS(A.html)

PHP Runtime
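One way to read the header format is as a list of offset/length pairs that mark where file names sit inside the response body. The Python sketch below encodes and decodes such a message. It rests on assumptions that the slides do not state: offsets and lengths are counted in bytes of the body, multiple pairs are separated by commas, and the paths `/srv/head.html` and `/srv/foot.html` as well as the function names are hypothetical.

```python
def encode(segments):
    """Build (header_value, body) from ("text", s) and ("file", path) segments.
    File segments contribute only their path to the body; the header records
    where each path starts and how long it is."""
    body = bytearray()
    ranges = []
    for kind, value in segments:
        data = value.encode("utf-8")
        if kind == "file":
            ranges.append(f"{len(body)}/{len(data)}")
        body += data
    return ", ".join(ranges), bytes(body)


def decode(header_value, body):
    """Split the body back into literal text and file paths (web server side)."""
    segments = []
    pos = 0
    for item in header_value.split(","):
        item = item.strip()
        if not item:
            continue
        offset_text, length_text = item.split("/")
        offset, length = int(offset_text), int(length_text)
        if offset > pos:
            segments.append(("text", body[pos:offset].decode("utf-8")))
        segments.append(("file", body[offset:offset + length].decode("utf-8")))
        pos = offset + length
    if pos < len(body):
        segments.append(("text", body[pos:].decode("utf-8")))
    return segments


if __name__ == "__main__":
    page = [
        ("text", "<h1>"),
        ("file", "/srv/head.html"),
        ("text", "</h1>"),
        ("file", "/srv/foot.html"),
    ]
    header_value, body = encode(page)
    print("X-ZeroCopy:", header_value)
    print(body)
    print(decode(header_value, body))
```

A web server that understands the header would write the text segments normally and pass each file segment to a zero-copy call such as sendfile; a server that does not would need the runtime to fall back to sending the file content itself.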


Transparency to Applications

PHP developers need not modify their applications to leverage our proposed approach

If there is a side effect on an FTCS object, our PHP runtime loads the content of the file when needed (Lazy I/O Processing)
– In this case, no performance improvement can be obtained

<?php
$a = file_get_contents("fileA");
$b = trim($a); // remove white space
?>


Outline of this talk

1. Background & Motivation

2. Our Solution: Zero Copy Data Transfer

3. Performance Evaluation

4. Related Work, Future Direction and Conclusions


Performance Evaluation

Micro-benchmark
– Theoretically, our approach should be effective when the FastCGI communication overhead between the PHP runtime and the Web server is a major performance bottleneck.
– To find the threshold where file size becomes a bottleneck, we prepared a micro-benchmark: a PHP script that simply displays a file (ranging from 10 KB to 200 KB) via the file_get_contents PHP function.

SPECweb2005
– A standard web benchmark consisting of three representative web scenarios: Banking, Ecommerce, and Support


Comparative PHP Runtimes: P9 vs. P9ZC vs. Zend

P9
– Our research PHP runtime with Just-in-time compiler
– Single Thread + 8 FastCGI Processes

P9ZC
– P9 with our proposed zero-copy data transfer
– Single Thread + 8 FastCGI Processes

Zend (PHP 5.2.5) (Only used for SPECweb)
– A major PHP runtime available from www.php.net
– APC (Alternative PHP Cache) is turned on to allow PHP intermediate code to be cached in shared memory
– Single Thread + 8 FastCGI Processes


Micro-benchmark: Throughput and Speedup (%) of P9ZC over P9 with varying file sizes.

[Figure: throughput of P9 and P9ZC and speedup of P9ZC over P9, plotted against file size]

• The speedup of P9ZC over P9 increases from 1.26 for a 10 KB file up to 2.26 for a 60 KB file.

• After 60 KB, the speedup gradually decreases, but P9ZC remains roughly twice as fast as P9.

Apache Bench (ab) with 1 process, 100 concurrent requests, and a 60 second run, measured after sufficient warm-up

SUT (3GB, 3.4GHz Xeon, Fedora Core7), Prime Client (3.4 Ghz Xeon), P9 as of 2008/08/08, Zend (PHP 5.2.5) with APC enabled
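The speedup figures and the headline percentage are two views of the same ratio: a speedup of 2.26 means the optimized throughput is 2.26 times the baseline, which is a 126% improvement. The small Python sketch below spells out that conversion; the function names and the throughput values in the demo are illustrative assumptions, not measurements from the talk.

```python
def speedup_from_throughput(baseline, optimized):
    """Speedup is the ratio of optimized to baseline throughput."""
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return optimized / baseline


def improvement_percent(speedup):
    """Convert a speedup ratio into a percentage improvement over the baseline."""
    return round((speedup - 1.0) * 100.0, 1)


if __name__ == "__main__":
    # Speedup values quoted on the slide for the smallest file and the 60 KB file.
    for speedup in (1.26, 2.26):
        print(speedup, "->", improvement_percent(speedup), "% improvement")
    # Illustrative throughput numbers only (requests per second).
    print(speedup_from_throughput(1000.0, 2260.0))
```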


Micro-benchmark: CPU usage for the memcpy function.

CPU Usage of memcpy (%) by File Size (KB). Lower is better. The chart legend labels the P9ZC series as P9FS.

File Size (KB)  20    40    60    80    100   120   140   160   180   200
P9              22.8  33.6  46    50.4  55.5  55.7  59.7  60.4  61.5  62.5
P9ZC            4.32  4.12  3.93  3.32  2.9   2.8   2.64  2.43  2.56  2.46

P9ZC significantly reduces CPU usage for the memcpy function, while P9 spends approximately 20% to 60% of its CPU time on memory copying, with the share growing as the file size grows


Performance Evaluation with SPECweb2005

[Figure: SPECweb2005 test configuration]

– Clients: 6 clients (Client Emulator), talking HTTP to the HTTP server
– Application Server: HTTP server (Lighttpd 1.4.19 with mod_fcgi) and PHP Process (8 processes), connected by the FastCGI Protocol over a named pipe / TCP socket
– Backend (Business Logic / DB) Simulator: BESIM (database simulator) behind an Apache HTTP server (FastCGI), reached over a tcp/ip socket

Machine labels shown in the figure: Linux (kernel 2.6.17); Linux 2.6.18, 2GB RAM, Xeon 2.4GHz, 2 CPU; Fedora Core 7, kernel 2.6.17, Pentium 4 3.4GHz, 2GB RAM; Linux 2.6.17, Xeon 2.4GHz, 1GB; Linux 2.6.2, Xeon 3.0GHz, 3GB


SPECweb2005 Banking

[Figure: Comparison by Good+Tolerable Requests. X axis: sessions (200 to 3000); Y axis: # of requests (0 to 25000); series: Zend, P9, P9ZC]

No performance improvement is observed due to the fact that sendfile is ineffective with SSL communication and

SUT (3GB, 3.4GHz Xeon, Fedora Core7), 3 Clients and Prime Client (3.4 Ghz Xeon), BESIM: Apache 2.2.26, IBM J2RE 1.5.0 Linux build, P9 as of 2008/08/08, Zend (PHP 5.2.5) with APC enabled, SPECweb 3 minute run

Average Data Transfer Size: 34.8 KB

Higher is better

Throughput


SPECweb2005 Ecommerce

[Figure: Comparison by Good Requests (Ecommerce). X axis: sessions (100 to 3000); Y axis: # of requests (0 to 25000); series: Zend, P9, P9ZC, P9 Level 1]

SUT (3GB, 3.4GHz Xeon, Fedora Core7), 3 Clients and Prime Client (3.4 Ghz Xeon), BESIM: Apache 2.2.26, IBM J2RE 1.5.0 Linux build, P9 as of 2008/08/08, Zend (PHP 5.2.5) with APC enabled, SPECweb 3 minute run

Our approach outperforms original P9 by 22%, and Zend by 57%

Average Data Transfer Size: 143.9 KB

Higher is better

Throughput

Sessions


SPECweb2005 Support

[Figure: Comparison by Good+Tolerable Requests in Support. X axis: sessions (100 to 2500); Y axis: # of requests (0 to 18000); series: Zend, P9, P9ZC]

SUT (3GB, 3.4GHz Xeon, Fedora Core7), 3 Clients and Prime Client (3.4 Ghz Xeon), BESIM: Apache 2.2.26, IBM J2RE 1.5.0 Linux build, P9 as of 2008/08/08, Zend (PHP 5.2.5) with APC enabled, SPECweb 3 minute run

Our approach outperforms original P9 by 31%, and Zend by 61%

Average Data Transfer Size: 78.5 KB

Higher is better

Throughput

Sessions


CPU Usage of memcpy in 3 scenarios

[Figure: CPU Usage of memcpy. Horizontal bars of % CPU used for memcpy (axis 0 to 18) for Banking, Banking (No SSL), Ecommerce, and Support; series: P9ZC (labeled P9FS in the legend), P9, Zend. Percentages are taken when each runtime reaches its peak throughput. Data labels as extracted: 3.13, 9.04, 12.48, 12.6, 2.1931, 7.22, 15.64, 12.43, 1.5534, 5.05, 3.43, 4.42]

P9ZC dramatically decreases CPU time used for memory copying

Average Data Transfer Size: Support 78.5 KB; Ecommerce 143.9 KB; Banking 34.8 KB

SUT (3GB, 3.4GHz Xeon, Fedora Core7), 3 Clients and Prime Client (3.4 Ghz Xeon), BESIM: Apache 2.2.26, IBM J2RE 1.5.0 Linux build, P9 as of 2008/08/08, Zend (PHP 5.2.5) with APC enabled, SPECweb 3 minute run


Outline of this talk

1. Background & Motivation

2. Our Solution: Zero Copy Data Transfer

3. Performance Evaluation

4. Related Work, Future Direction, and Conclusions


Related Work

Zero Copy Approach
– Evaluation of sendfile [Nahum, TON ‘02]
– Faster FastCGI in Lighttpd 1.5 beta [‘08]
• Implemented as P9 Level 1

SSL-enabled sendfile [Keromytis, TOCS ‘06]
– Could allow our approach to be used for the SPECweb Banking scenario

In-kernel Web Servers [Armol, TOCS ‘04] [King, USENIX ‘01]
– Their focus is on sending only static files


Future Directions

Apply our approach to other programming languages such as Java, Ruby, and Python

Performance evaluation with more applications such as SugarCRM, MediaWiki, and phpBB

Extend the proposed approach to general cases where file processing is not explicitly required
– Constant and long character sequences can be dynamically and/or statically stored in a flat file
– A web server sends the file to web clients via the sendfile system call
– Challenge: runtime overhead


Conclusions

Proposed a novel approach that improves Web application performance through zero-copy data transfer

Showed promising performance improvement over our original PHP runtime
– 126% performance improvement with micro-benchmarks
– 31% performance improvement for standard Web benchmark, SPECweb2005