0-60 with Goliath: high performance web services
0-60 with Goliath @igrigorik
0-60 with Goliath
Ilya Grigorik @igrigorik
Building high performance (Ruby) web-services
Brief History: Goliath @ PostRank
- “Social Analytics”
- Rails frontend
- Ruby backend
- 95%+ of traffic via APIs

- Goliath == v3 API stack
- Open sourced in 2011
- Growing community
[Diagram: a Rails frontend calling several HTTP API services, each backed by its own SQL or Solr store]
PRO
• Separate logical & physical services
• Easy to tune, easy to maintain, easy to “scale”
• Stable code, fault-tolerance
CON
• Higher upfront ops cost
• Lots of cross-service communication
• Single responsibility web-services
• Async HTTP response streaming + progressive notifications
• Async HTTP request streaming + progressive notifications
• Multiple requests within the same VM
• Keep-alive support
• Pipelining support
• Ruby API & “X-Ruby friendly”
• Easy to maintain & test
… full HTTP 1.1 support
www.goliath.io
… lower ops costs
… Ruby polyglot!
HTTP Pipelining + Keep-Alive 101
perhaps not as simple as it may seem…
HTTP Quiz
this is not a trick question…
conn = EM::HttpRequest.new('http://oredev.org')
r1 = conn.get :path => "2011/speakers", :keepalive => true # 250 ms
r2 = conn.get :path => "2011/faq" # 300 ms
# wait until done …
Total execution time is:
(a) 250 ms
(b) 300 ms
(c) 550 ms
~ 65% truthiness
~ 25% truthiness *
~ 10% truthiness **
Answer:
All of the above!
Client
HTTP 1.0
RFC 1945 (1996)
Server
20 ms TCP handshake
HTTP Request
• headers, body
Multi-part body (*)
Terminate connection
40 ms processing
+ 40ms TCP setup (network)
+ 20ms request (network)
+ 40ms processing
+ 20ms response (network)
66% of time in network overhead
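The 66% figure follows directly from the timings on the slide. A minimal sketch of that arithmetic, where the constant and method names are illustrative and not from the talk:

```ruby
# Timings from the slide, in milliseconds. Names are illustrative.
HTTP10_REQUEST = {
  tcp_setup:  { ms: 40, network: true  },
  request:    { ms: 20, network: true  },
  processing: { ms: 40, network: false },
  response:   { ms: 20, network: true  }
}.freeze

# Total wall-clock time for one request.
def total_ms(steps)
  steps.values.sum { |step| step[:ms] }
end

# Fraction of the total spent on the network rather than in the server.
def network_share(steps)
  network = steps.values.select { |step| step[:network] }.sum { |step| step[:ms] }
  network.to_f / total_ms(steps)
end

puts "total: #{total_ms(HTTP10_REQUEST)} ms"
puts "network share: #{(network_share(HTTP10_REQUEST) * 100).floor}%"
```

Only 40 of the 120 ms are spent doing useful work on the server, which is why connection reuse matters so much in the slides that follow.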
Research done at Google shows that an increase from 5Mbps to 10Mbps results in a disappointing 5% improvement in page load times.
Or put slightly differently, a 10Mbps connection, on average uses only 16% of its capacity.
http://bit.ly/oemX0I
HTTP 1.1
RFC 2616 (1999)
Keep-alive
• Re-use open connection
• No multiplexing, serial
• Default to “on”
Pipelining
• No multiplexing
• Parallel requests
Keep-alive
RFC 2616 (1999)
+ 40ms TCP setup (network)
+ 20ms request (network) + 40ms processing + 20ms response (network)
x 40ms TCP setup (network), skipped because the connection is reused
+ 20ms request (network) + 40ms processing + 20ms response (network)
200ms for two requests
Small win over HTTP 1.0
Keep-alive
RFC 2616 (1999)
Connection: close < ugh!
Net::HTTP
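To my understanding, current versions of Net::HTTP add `Connection: close` only to requests made outside a started session; inside a `Net::HTTP.start` block the socket is reused. The sketch below checks that against a throwaway local server. The helper names are hypothetical, and the server is a bare-bones stand-in that only understands simple GET requests:

```ruby
require 'net/http'
require 'socket'

# Throwaway local server so the example needs no external network access.
# It answers every request on a connection until the client hangs up and
# counts how many TCP connections were accepted.
def with_counting_server
  server = TCPServer.new('127.0.0.1', 0)
  stats = { connections: 0 }
  thread = Thread.new do
    loop do
      client = server.accept
      stats[:connections] += 1
      while client.gets # request line
        nil while (line = client.gets) && line != "\r\n" # skip headers
        client.write "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok"
      end
      client.close
    end
  end
  yield server.addr[1], stats
ensure
  thread&.kill
  server&.close
end

# Two GETs inside one Net::HTTP.start block should share one connection.
def connections_for_two_requests
  with_counting_server do |port, stats|
    # The third argument (nil) bypasses any proxy configured via ENV.
    Net::HTTP.start('127.0.0.1', port, nil) do |http|
      http.get('/2011/speakers')
      http.get('/2011/faq')
    end
    stats[:connections]
  end
end

puts "TCP connections used: #{connections_for_two_requests}"
```

The paths mirror the quiz earlier in the deck; any path works, since the stand-in server ignores it.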
Pipelining
RFC 2616 (1999)
+ 40ms TCP setup (network)
+ 20ms request (network) + 40ms processing + 20ms response (network)
60% of time in network overhead
120ms for two requests – 50% improvement!
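The totals from the last few slides can be reproduced with a small model. This is a sketch under two assumptions that the slides imply but do not state: the pipelined requests overlap completely on the server, and the 50% improvement is measured against two separate HTTP 1.0 requests. Method names are illustrative.

```ruby
# Per-step costs from the slides, in milliseconds.
TCP_SETUP  = 40
REQUEST    = 20
PROCESSING = 40
RESPONSE   = 20
ROUND_TRIP = REQUEST + PROCESSING + RESPONSE # 80 ms once connected

# HTTP 1.0: every request pays for its own connection.
def http10_ms(requests)
  requests * (TCP_SETUP + ROUND_TRIP)
end

# Keep-alive: one connection, requests run one after another.
def keepalive_ms(requests)
  TCP_SETUP + requests * ROUND_TRIP
end

# Pipelining as drawn on the slide: requests are sent back to back and
# handled concurrently, so the round trips are assumed to overlap fully.
def pipelined_ms(_requests)
  TCP_SETUP + ROUND_TRIP
end

# Relative saving of `faster` over `baseline`, as a percentage.
def improvement_pct(baseline, faster)
  ((baseline - faster) * 100.0 / baseline).round
end

puts "HTTP 1.0:   #{http10_ms(2)} ms"
puts "Keep-alive: #{keepalive_ms(2)} ms"
puts "Pipelining: #{pipelined_ms(2)} ms"
puts "Pipelining vs HTTP 1.0: #{improvement_pct(http10_ms(2), pipelined_ms(2))}% faster"
```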
Pipelining Quiz
RFC 2616 (1999)
Connection setup: 50ms
Request 1: 300ms
Request 2: 250ms
Total time:
(a) ~250 ms
(b) ~300 ms
(c) ~350 ms
(d) ~600 ms
Benchmark client round-trip time (RTT), not just the server processing time
* a public service announcement
Making HTTP Pipelining Usable on the Open Web
http://tools.ietf.org/html/draft-nottingham-http-pipeline-01
There is just one small gotcha…
HTTP in the wild
it’s a sad state of affairs
conn = EM::HttpRequest.new('http://gogaruco.com')
r1 = conn.get :path => "speakers.html", :keepalive => true # 250 ms
r2 = conn.get :path => "schedule.html" # 300 ms
Total execution time is:
(a) 250 ms
(b) 300 ms
(c) 550 ms
~ 65% truthiness
~ 25% truthiness *
~ 10% truthiness **
Keep-alive what? HTTP 1.0!
Good: Keep-alive + Pipelining
Bad: Keep-alive + Garbage
“I’m confused”
Keep-alive: mostly works – yay!
Pipelining: disabled (except in Opera)
HTTP can be a high-performance transport
Goliath is our attempt to make it work
Network
EventMachine
(streaming) HTTP Parser
Middleware Routing
(optional) Fibers
Client API
Ruby, JRuby, Rubinius …
async-rack
“Sync API”
Goliath
Optimize bottom up + minimal client API
optional async
HTTP 1.1 0.3 ms
EventMachine Reactor
concurrency without threads
while true do
  timers
  network_io
  other_io
end
p "Starting"
EM.run do
  p "Running in EM reactor"
end
p "won't get here"
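The reactor loop above can be made concrete with a toy version in plain Ruby. This is a sketch of the idea only: `TinyReactor` and its methods are hypothetical, not EventMachine's API, and it handles timers while leaving socket polling as a comment.

```ruby
# A toy single-threaded reactor: one loop that fires due timers.
class TinyReactor
  def initialize
    @timers = []   # [fire_at, callback] pairs
    @running = false
  end

  # Schedule a block to run once, roughly `seconds` from now.
  def add_timer(seconds, &callback)
    @timers << [now + seconds, callback]
  end

  def stop
    @running = false
  end

  # Blocks the caller until #stop is called, much like EM.run does.
  def run
    @running = true
    yield self if block_given?
    while @running
      due, @timers = @timers.partition { |fire_at, _| fire_at <= now }
      due.sort_by(&:first).each { |_, callback| callback.call }
      # A real reactor would also poll sockets here (e.g. with IO.select).
      sleep 0.001
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

# Two timers registered out of order still fire in deadline order,
# all on the one thread that called #run.
def reactor_demo
  events = []
  TinyReactor.new.run do |reactor|
    reactor.add_timer(0.02) { events << :second; reactor.stop }
    reactor.add_timer(0.01) { events << :first }
    events << :scheduled
  end
  events
end

p reactor_demo
```

As with `EM.run`, nothing after the call to `run` executes until a callback stops the loop, which is why any blocking call inside a callback stalls everything else.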
EventMachine
Non-blocking IO requires non-blocking drivers:
• AMQP: http://github.com/tmm1/amqp
• MySQLPlus: http://github.com/igrigorik/em-mysqlplus
• Memcached: http://github.com/astro/remcached
• DNS: http://github.com/astro/em-dns
• Redis: http://github.com/madsimian/em-redis
• MongoDB: http://github.com/tmm1/rmongo
• HTTPRequest: http://github.com/igrigorik/em-http-request
• WebSocket: http://github.com/igrigorik/em-websocket
• Amazon S3: http://github.com/peritor/happening
And many others: http://wiki.github.com/eventmachine/eventmachine/protocol-implementations
EventMachine
Goliath: Hands on…
class AsyncUpload < Goliath::API
  # called once the request headers have been parsed
  def on_headers(env, headers)
    env.logger.info 'received headers: ' + headers.inspect
  end

  # called for each chunk of the request body as it arrives
  def on_body(env, data)
    env.logger.info 'received data chunk: ' + data
  end

  def on_close(env)
    env.logger.info 'closing connection'
  end

  def response(env)
    # called when request processing is complete
  end
end
Async Request Processing
don’t need to wait for the full request…
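One reason to process a request before it has fully arrived is to avoid buffering large uploads. The sketch below imitates the `on_headers` / `on_body` callback shape in plain Ruby. `ChecksumUpload` and `stream_upload` are hypothetical names, and the callbacks drop the `env` argument that Goliath passes; the handler keeps a running checksum and byte count instead of storing the body.

```ruby
require 'digest'

# Hypothetical handler: consumes an upload chunk by chunk.
class ChecksumUpload
  attr_reader :bytes

  def initialize
    @digest = Digest::SHA256.new
    @bytes = 0
    @expected = nil
  end

  # Headers arrive first, so the expected size is known up front.
  def on_headers(headers)
    @expected = headers['Content-Length']&.to_i
  end

  # Each chunk is folded into the digest and then discarded.
  def on_body(chunk)
    @digest << chunk
    @bytes += chunk.bytesize
  end

  def complete?
    @expected.nil? || @bytes == @expected
  end

  def hexdigest
    @digest.hexdigest
  end
end

# Stand-in for the server's streaming parser: feeds the handler piece by piece.
def stream_upload(handler, headers, chunks)
  handler.on_headers(headers)
  chunks.each { |chunk| handler.on_body(chunk) }
  handler
end

upload = stream_upload(ChecksumUpload.new, { 'Content-Length' => '11' }, ['hello ', 'world'])
puts "bytes: #{upload.bytes}, complete: #{upload.complete?}"
puts "sha256: #{upload.hexdigest}"
```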
(streaming) HTTP Parser
class Stream < Goliath::API
  def response(env)
    pt = EM.add_periodic_timer(1) { env.stream_send("hello") }

    EM.add_timer(10) do
      pt.cancel
      env.stream_send("goodbye!")
      env.stream_close
    end

    streaming_response 202, {'X-Stream' => 'Goliath'}
  end
end
Async/Streaming Response
don’t need to render full response…
(streaming) HTTP Parser
class Websocket < Goliath::WebSocket
  def on_open(env)
    env.logger.info "WebSocket opened"
  end

  def on_message(env, msg)
    env.logger.info "WebSocket message: #{msg}"
  end

  def on_close(env)
    env.logger.info "WebSocket closed"
  end

  def on_error(env, error)
    env.logger.error error
  end
end
Web-Sockets
simple backend extension
(streaming) HTTP Parser
class Hello < Goliath::API
  use Goliath::Rack::Params
  use Goliath::Rack::JSONP
  use Goliath::Rack::Validation::RequestMethod, %w(GET)
  use Goliath::Rack::Validation::RequiredParam, {:key => 'echo'}

  def response(env)
    [200, {}, {pong: params['echo']}]
  end
end
Middleware
No rackup file
Middleware Routing
class Bonjour < Goliath::API
  def response(env)
    [200, {}, "bonjour!"]
  end
end

class RackRoutes < Goliath::API
  map '/version' do
    run Proc.new { |env| [200, {}, ["Version 0.1"]] }
  end

  get "/bonjour", Bonjour

  not_found('/') do
    # run Proc.new { ... }
  end
end
Routing
simple and powerful
Middleware Routing
Network
EventMachine
(streaming) HTTP Parser
Middleware Routing
(optional) Fibers
Client API
Ruby Fibers + Goliath
synchronous API for asynchronous processing
Ruby 1.9 Fibers
and cooperative scheduling
Ruby 1.9 Fibers are a means of creating code blocks which can be paused and resumed by our application (think lightweight threads, minus the thread scheduler and less overhead).
f = Fiber.new {
  while true do
    Fiber.yield "Hi"
  end
}
p f.resume # => Hi
p f.resume # => Hi
p f.resume # => Hi
Manual / cooperative scheduling!
http://bit.ly/d2hYw0
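Fibers pass values in both directions, which is what makes them useful for hiding callbacks later in the talk: `resume` hands a value into the fiber, and `Fiber.yield` hands one back out. A small illustration (the `running_total` helper is made up for this example):

```ruby
# A fiber that keeps a running total. Each resume passes a number in, and
# each Fiber.yield returns the current total to the caller.
def running_total
  Fiber.new do |first|
    total = first
    loop { total += Fiber.yield(total) }
  end
end

acc = running_total
p acc.resume(1)  # => 1
p acc.resume(10) # => 11
p acc.resume(5)  # => 16
```

Nothing runs inside the fiber between calls to `resume`; the caller decides when it gets control, which is the manual, cooperative scheduling the slide refers to.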
Ruby 1.9 Fibers
and cooperative scheduling
http://bit.ly/aesXy5
Fibers vs Threads: creation time much lower
Fibers vs Threads: memory usage is much lower
Untangling Evented Code with Fibers
http://bit.ly/d2hYw0
def query(sql)
  f = Fiber.current
  conn = EventMachine::MySQL.new(:host => 'localhost')
  q = conn.query(sql)
  # resume the paused fiber once the query succeeds or fails
  q.callback { f.resume(conn) }
  q.errback { f.resume(conn) }
  return Fiber.yield
end

EventMachine.run do
  Fiber.new {
    res = query('select sleep(1)')
    puts "Results: #{res.fetch_row.first}"
  }.resume
end
Exception, async!
1. Wrap into a continuation
2. Pause the continuation
3. Resume the continuation
Fixed!
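The three steps can be tried without EventMachine or a database. In the sketch below, `PENDING` and `async_query` are hypothetical stand-ins for the reactor and an evented driver; only the `query` wrapper follows the pattern from the slides.

```ruby
require 'fiber' # needed for Fiber.current before Ruby 3.1; harmless on newer versions

# Stand-in for the reactor: callbacks waiting to be fired later.
PENDING = []

# Stand-in for an evented driver call: returns immediately and invokes the
# callback with the result on a later reactor pass.
def async_query(sql, &callback)
  PENDING << -> { callback.call("result of #{sql}") }
end

def query(sql)
  fiber = Fiber.current                               # 1. wrap into a continuation
  async_query(sql) { |result| fiber.resume(result) }  # 3. resume it from the callback
  Fiber.yield                                         # 2. pause until the result arrives
end

def run_demo
  log = []
  Fiber.new do
    log << "before"
    log << query("select 1") # reads like a blocking call
    log << "after"
  end.resume
  log << "reactor tick"      # control returned here while the query was pending
  PENDING.shift.call until PENDING.empty?
  log
end

p run_demo
```

The order of the log entries shows the point: the fiber stops at `query`, the caller keeps going, and the fiber picks up where it left off once the callback fires.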
Goliath automatically wraps each incoming request into a Ruby fiber, allowing us to hide the async complexity from the developer.
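The idea of one fiber per request can be illustrated with a toy dispatcher. This is not Goliath's implementation: `MiniServer` and its methods are hypothetical, and a plain array of callbacks stands in for the reactor.

```ruby
require 'fiber' # needed for Fiber.current before Ruby 3.1; harmless on newer versions

class MiniServer
  attr_reader :responses

  def initialize
    @callbacks = [] # work the "reactor" will run on a later pass
    @responses = []
  end

  # Async primitive: park the calling fiber until a later reactor pass.
  def wait_for_io
    fiber = Fiber.current
    @callbacks << -> { fiber.resume }
    Fiber.yield
  end

  # Each request gets its own fiber, so a handler can "block" without
  # holding up the other requests.
  def handle(name, &handler)
    Fiber.new { @responses << handler.call(name) }.resume
  end

  # Drain the callback queue, resuming parked fibers in order.
  def run
    @callbacks.shift.call until @callbacks.empty?
    @responses
  end
end

def serve_two_requests
  server = MiniServer.new
  slow = lambda do |name|
    2.times { server.wait_for_io }
    "#{name} done"
  end
  fast = lambda do |name|
    server.wait_for_io
    "#{name} done"
  end
  server.handle("slow", &slow)
  server.handle("fast", &fast)
  server.run
end

p serve_two_requests
```

The slow request is accepted first but finishes last, and neither handler contains a callback; that is the developer experience the slide describes.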
require 'goliath'
class Hello < Goliath::API
  def response(env)
    [200, {}, "Hello World"]
  end
end
$> ruby hello.rb -sv -p 8000 -e production
Hello World
Simple Goliath server
def response(env)
  conn = EM::HttpRequest.new('http://google.com/')
  r1 = conn.aget :query => {:q => 'oredev'}

  r1.callback do
    r2 = conn.aget :query => {:q => 'goliath'}
    r2.callback { ... }
    r2.errback { ... }
  end
  r1.errback { ... }

  streaming_response 200, {}
end
Async fun
{ { {} } } …
async response
def response(env)
  conn = EM::HttpRequest.new('http://google.com/')
  r1 = conn.get :query => {:q => 'oredev'}

  if r1.error?
    r2 = conn.get :query => {:q => 'goliath'}
  else
    # ...
  end

  [200, {}, r2.response]
end
Async + Fibers
Ruby gives us a choice
async + fiber response
EM-Synchrony:
multi = EventMachine::Synchrony::Multi.new
multi.add :a, db.aquery("select sleep(1)")
multi.add :b, db.aquery("select sleep(1)")
res = multi.perform
• em-http-request
• em-memcached
• em-mongo
• em-jack
• mysql2
• redis
• …

• ConnectionPool
• MultiRequest
• Iterators
• Inline async
• TCPSocket
• …
http://github.com/igrigorik/em-synchrony
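The Multi example on this slide starts several async operations and pauses the fiber once until all of them finish. A rough sketch of that idea in plain Ruby follows; `MiniMulti` is hypothetical and is not em-synchrony's code. It assumes at least one operation is added and that every operation completes later, from outside the fiber.

```ruby
require 'fiber' # needed for Fiber.current before Ruby 3.1; harmless on newer versions

class MiniMulti
  def initialize
    @pending = {}
    @results = {}
  end

  # `operation` is any callable that accepts a completion callback.
  def add(name, operation)
    @pending[name] = operation
  end

  # Start everything, pause the current fiber, and return all results
  # once the last operation has reported back.
  def perform
    fiber = Fiber.current
    remaining = @pending.size
    @pending.each do |name, operation|
      operation.call(lambda do |value|
        @results[name] = value
        remaining -= 1
        fiber.resume(@results) if remaining.zero?
      end)
    end
    Fiber.yield
  end
end

def multi_demo
  queue = [] # stands in for the reactor
  # Builds an operation that completes with `value` on a later reactor pass.
  deferred = ->(value) { ->(done) { queue << -> { done.call(value) } } }
  result = nil
  Fiber.new do
    multi = MiniMulti.new
    multi.add :a, deferred.call(1)
    multi.add :b, deferred.call(2)
    result = multi.perform
  end.resume
  queue.shift.call until queue.empty?
  result
end

p multi_demo
```

With a real reactor both operations would be in flight at the same time, so two one-second queries like those on the slide would take about one second in total rather than two.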
describe HttpLog do
  it 'forwards to our API server' do
    with_api(HttpLog, api_options) do |api|
      get_request({}, err) do |c|
        c.response_header.status.should == 200
        c.response_header['X-Header'].should == 'Header'
        c.response.should == 'Hello from Responder'
      end
    end
  end
end
Integration Testing
simple end-to-end testing
Phew, time for questions?
hope this convinced you to explore the area further…
Peepcode
http://peepcode.com/products/eventmachine-ii
http://peepcode.com/products/eventmachine-i
Goliath
https://goliath.io/
https://github.com/postrank-labs/goliath/
gem install goliath