ruby http clients comparison

Post on 12-Jun-2015

Ruby HTTP clients comparison

Hiroshi Nakamura (nahi on Twitter and GitHub)

Technical Architect at Appirio Japan

CRuby and JRuby committer

Asakusa.rb: http://qwik.jp/asakusarb/

Ruby HTTP Clients Matrix

advantages-and-disadvantages comparison of HTTP client libraries

http://bit.ly/RubyHTTPClients2012

Disclaimer: I'm the author of "httpclient"

Agenda

net/http

16 libraries I picked

Ruby HTTP Clients Matrix

API style

Compatibility

Supported features

Performance Comparisons

My Recommendations

net/http

Net::HTTP::Proxy? net/https?
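As a rough illustration of the two questions on this slide: on Ruby 1.9 and later, both the proxy and SSL can be configured on a plain Net::HTTP instance. This is a minimal sketch, assuming the hypothetical hosts www.example.com and proxy.example.com; Net::HTTP.new only builds the client object, so nothing is sent over the network.

```ruby
require 'net/http'
require 'openssl'

# Hypothetical hosts, for illustration only. No connection is opened here.
# The third and fourth arguments are the proxy address and proxy port.
http = Net::HTTP.new('www.example.com', 443, 'proxy.example.com', 8080)
http.use_ssl = true                           # enable HTTPS on this client
http.verify_mode = OpenSSL::SSL::VERIFY_PEER  # verify the server certificate

p http.use_ssl?       # true
p http.proxy?         # true
p http.proxy_address  # "proxy.example.com"
p http.proxy_port     # 8080
```

Passing the proxy to Net::HTTP.new keeps the whole configuration on one object, which is easier to follow than generating a proxy class with Net::HTTP::Proxy.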

Ruby HTTP Client libraries

HTTP client libraries I didn’t evaluate

Cannot evaluate
• activeresource (Rails specific)
• http (under development)
• http_request.rb (test doesn't pass)
• nestful (no test)
• typhoeus (under heavy rewrite)

Obsolete
• eventmachine (built-in client is obsolete)
• right_http_connection (no update)
• simplehttp (no update)
• rfuzz (no update)

Evaluation Axis

Project Stats

API style

Compatibility: CRuby, JRuby, Rubinius

Supported features

Connection features

Basic HTTP features

Development support

Advanced features

http://bit.ly/RubyHTTPClientsFeatureTest (test/unit scripts)

Project Stats

API style

sync API (1/5) - Client instance

net/http, mechanize, httpclient, patron, curb, faraday

    # net/http
    client = Net::HTTP.new(host, port)
    p client.get(path).body

    # mechanize, httpclient, patron: create any one of these clients
    client = Mechanize.new
    client = HTTPClient.new
    client = Patron::Session.new
    p client.get(url).body

    # curb
    curl = Curl::Easy.new(url)
    curl.http_get
    p curl.body_str

    # faraday
    client = Faraday.new(:url => baseurl)
    p client.get(path).body

sync API (2/5) - Client class

restfulie, excon, httpi

    p Restfulie.at(url).get!.body
    p Excon.get(url).body
    p HTTPI.get(url).body

sync API (3/5) - Resource

rest-client, rufus-verbs

    # rest-client
    rs = RestClient::Resource.new('http://example.com')
    rs['posts/1/comments'].post 'Good article.', :content_type => 'text/plain'

    # rufus-verbs
    ep = EndPoint.new(
      :host => "resta.farian.host",
      :port => 80,
      :resource => "inventory/tools")
    res = ep.get :id => 1
    res = ep.get :id => 2

sync API (4/5) - Include & Customize

httparty, weary

    # httparty
    client = Class.new { include HTTParty }
    p client.get(url)

    # weary: "get :retrieve" defines a #retrieve method on the client
    class WearyClient < Weary::Client
      domain 'http://api.target.org/'
      get :retrieve, '{path}'
    end
    p WearyClient.new.retrieve(:path => path).perform.body

sync API (5/5) - Others

open-uri.rb, wrest

    # open-uri.rb
    p open(url) { |f| f.read }

    # wrest
    p 'http://www.google.co.jp/'.to_uri.get.body

async API(1/2) - Callback

em-http-request

    body = nil
    EM.run do
      req = EM::HttpRequest.new('http://www.google.com/').get
      req.callback do
        body = req.response
        EM.stop
      end
      req.errback do
        body = nil
        EM.stop  # stop the reactor on failure too, otherwise EM.run never returns
      end
    end
    p body

async API(2/2) - Polling

httpclient, weary, wrest

    client = HTTPClient.new
    conn = client.get_async(url)
    conn.finished? # => false
    # ...
    io = conn.pop.content
    while str = io.read(4096)
      p str
    end
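The polling style can be sketched with nothing but a thread: the request runs in the background and the caller checks on it when convenient. The PollingJob class below is hypothetical and is not how httpclient, weary, or wrest implement their async APIs; a short sleep stands in for a slow request.

```ruby
# Hypothetical helper, not part of any of the libraries named above:
# runs a block on a background thread and lets the caller poll for completion.
class PollingJob
  def initialize(&work)
    @thread = Thread.new(&work)
  end

  # True once the background work has ended (normally or with an error).
  def finished?
    !@thread.alive?
  end

  # Blocks until the work is done and returns the block's result.
  def pop
    @thread.value
  end
end

job = PollingJob.new { sleep 0.05; 'response body' }  # stand-in for a slow request
# ... the caller is free to do other work here ...
sleep 0.01 until job.finished?
p job.pop
```

The caller decides when to block, which is the main difference from the callback style, where the library decides when your code runs.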

parallel API - curb

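    responses = {}  # keyed by URL, so this must be a Hash rather than an Array
    m = Curl::Multi.new
    urls.each do |url|
      responses[url] = ''
      m.add(Curl::Easy.new(url) { |curl|
        curl.on_body { |data|
          responses[url] << data
          data.bytesize  # on_body must return the number of bytes handled
        }
      })
    end
    m.perform
    p responses.values.map { |body| body.bytesize }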

API style

Compatibility

Connection features

3 types of HTTP connections

Keep-Alive in em-http-request

    body = []
    EM.run do
      conn = EventMachine::HttpRequest.new(server.url)
      req1 = conn.get(:keepalive => true)
      req1.callback {
        body << req1.response
        req2 = conn.get(:keepalive => true)
        req2.callback {
          body << req2.response
          req3 = conn.get(:keepalive => true)
          req3.callback {
            body << req3.response
            req4 = conn.get(:keepalive => true)
            req4.callback {
              body << req4.response
              EM.stop
            }
            req4.errback { ... }
          }
          req3.errback { ... }
        }
        req2.errback { ... }
      }
      req1.errback { ... }
    end

Pipelining in em-http-request

    body = []
    EM.run do
      conn = EventMachine::HttpRequest.new(server.url)
      req1 = conn.get(:keepalive => true)
      req2 = conn.get(:keepalive => true)
      req3 = conn.get(:keepalive => true)
      req4 = conn.get()
      req1.callback { body << req1.response }
      req2.callback { body << req2.response }
      req3.callback { body << req3.response }
      req4.callback { body << req4.response; EM.stop }
      req1.errback { ... }
      req2.errback { ... }
      req3.errback { ... }
      req4.errback { ... }
    end

NO verification by default?!

options[:ssl_ca_file] == nil => VERIFY_NONE

    if http.use_ssl?
      http.verify_mode = OpenSSL::SSL::VERIFY_NONE
      if options[:ssl_ca_file]
        http.ca_file = options[:ssl_ca_file]
        http.verify_mode = OpenSSL::SSL::VERIFY_PEER
      end
    end
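A safer default inverts that logic: verify the peer unless the caller explicitly opts out. The following is a minimal sketch, not a patch to any library. configure_ssl is a hypothetical helper, :ssl_ca_file mirrors the option name in the snippet above, and :insecure is an option invented for this example.

```ruby
require 'net/http'
require 'openssl'

# Hypothetical helper: verification is on by default and must be switched off
# explicitly. :insecure is an invented option name for this sketch.
def configure_ssl(http, options = {})
  return http unless http.use_ssl?
  http.verify_mode =
    options[:insecure] ? OpenSSL::SSL::VERIFY_NONE : OpenSSL::SSL::VERIFY_PEER
  # A custom CA bundle is optional; it no longer decides whether we verify.
  http.ca_file = options[:ssl_ca_file] if options[:ssl_ca_file]
  http
end

# Hypothetical host; no connection is opened by this example.
http = Net::HTTP.new('www.example.com', 443)
http.use_ssl = true
configure_ssl(http)
p http.verify_mode == OpenSSL::SSL::VERIFY_PEER
```

When no CA file is given, verification should fall back to OpenSSL's default certificate store, so the missing option no longer silently disables the check.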

Connection features

Basic HTTP features

IRI: Internationalized Resource Identifier

uri.rb doesn't support IRI

addressable/uri does

client.get("http://www.ebooks.com/797059/some-kind-of-peace/grebe-camilla-träff-åsa-norlen-paul/")
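When addressable is not available, one workaround is to percent-encode the non-ASCII characters before handing the string to uri.rb. This is a minimal sketch under that assumption. escape_non_ascii is a hypothetical helper, it only covers the path and query, and it does not convert internationalized host names.

```ruby
require 'uri'

# Hypothetical helper: percent-encode every non-ASCII character as its UTF-8
# bytes so that uri.rb can parse the result. Host names are not converted.
def escape_non_ascii(iri)
  iri.gsub(/[^[:ascii:]]/) do |char|
    char.bytes.map { |byte| format('%%%02X', byte) }.join
  end
end

# \u00E4 and \u00E5 are the a-umlaut and a-ring from the URL above.
iri = "http://www.ebooks.com/797059/some-kind-of-peace/" \
      "grebe-camilla-tr\u00E4ff-\u00E5sa-norlen-paul/"
uri = URI.parse(escape_non_ascii(iri))
p uri.host
p uri.path
```

A library with real IRI support remains the better choice, because it also normalizes the string and handles international host names.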

"Cross-site Cooking" bug in httpclient

httpclient accepts this cookie and sends it to every *.com site

Mechanize handles it properly like browsers

http://en.wikipedia.org/wiki/Cross-site_cooking

Set-Cookie: TEST=test; path=/; domain=.com
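The check a client needs before storing such a cookie can be sketched as follows. This is a simplified heuristic and not the actual code of httpclient or Mechanize; acceptable_cookie_domain? is a hypothetical name. Browsers consult the Public Suffix List, which also catches suffixes such as "co.jp" that this sketch would let through.

```ruby
# Hypothetical check, not taken from any library.
# Rejects a Domain attribute that is a bare top-level label such as ".com",
# and requires the request host to fall within the cookie domain.
def acceptable_cookie_domain?(request_host, cookie_domain)
  domain = cookie_domain.downcase.sub(/\A\./, '')  # drop the leading dot
  host = request_host.downcase
  return false unless domain.include?('.')         # ".com" becomes "com": no embedded dot
  host == domain || host.end_with?(".#{domain}")
end

p acceptable_cookie_domain?('www.example.com', '.com')          # false
p acceptable_cookie_domain?('www.example.com', '.example.com')  # true
p acceptable_cookie_domain?('www.example.com', '.example.org')  # false
```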

Streaming upload/download

    # Chunked upload with Patron
    client = Patron::Session.new
    res = client.request(:post, url, {}, :file => path_to_upload)
    # Chunked download
    client.get_file(url, path_to_write)

    # Chunked upload with em-http-request
    req = EM::HttpRequest.new(url).post :file => path_to_upload
    # Chunked download
    req = EM::HttpRequest.new(url).get
    req.stream do |chunk|
      p chunk
    end

Basic HTTP features

Development Support

Response stubbing

    # Stubbing response body
    client = HTTPClient.new
    client.test_loopback_response << 'Hello!'
    client.get('http://www.example.com/hello').body #=> "Hello!"

    # Stubbing HTTP response
    client.test_loopback_http_response <<
      "HTTP/1.0 302 Found\r\nLocation: http://foo/\r\n\r\n" <<
      "HTTP/1.0 200 OK\r\n\r\nHello!"
    client.post('http://www.example.com/todo',
      :follow_redirect => true, :body => '{}').body #=> "Hello!"

IRB like shell

rest-client, httpclient, wrest

    % restclient https://example.com user pass
    >> delete '/private/resource'

    % httpclient
    >> get "https://www.google.com", :q => :ruby

    % wrest
    >> 'http://www.google.com?q=ruby'.to_uri.get

Replayable log

rest-client

    % RESTCLIENT_LOG=/tmp/restclient.log restclient
    >> RestClient.get "https://www.google.com/"
    ...
    % cat /tmp/restclient.log
    RestClient.get "https://www.google.com/", "Accept"=>"*/*; q=0.5, application/xml", "Accept-Encoding"=>"gzip, deflate"
    # => 200 OK | text/html 13354 bytes

Development Support

Advanced features

HTML form handling of Mechanize

    agent = Mechanize.new
    page = agent.get(url)
    form = page.form('login')
    form.email = 'nahi@ruby-lang.org'
    form.password = 'jKH.P945wruV*qh3'
    page = agent.submit(form, form.button('submit'))

Advanced features

Testing your client

webmock by Bartosz Blimke (bblimke)

Library for stubbing and setting expectations on HTTP requests in Ruby.

vcr by Myron Marston (myronmarston)

Record your test suite's HTTP interactions and replay them during future test runs for fast, deterministic, accurate tests.

Performance Comparisons

Server
• Linode Xen VPS (Linode 512) at Fremont, CA
• Ubuntu 10.10
• Apache 2.2, KeepAlive On

Client
• AWS EC2 (m1.small) at North Virginia (us-east-1b)
• Ubuntu 12.04
• HTTP clients w/ CRuby 1.9.3p286

Multiple downloads of 177B.html and 24MB.zip

Don't take it too seriously! http://bit.ly/RubyHTTPClientsBenchmarkScript
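The measurement idea behind such a comparison can be sketched in a few lines. This is not the benchmark script linked above; time_requests is a hypothetical helper, and a trivial block stands in for the real request so the example runs without a network.

```ruby
# Hypothetical timing helper, only a sketch of the measurement idea.
# Runs the block `count` times and returns the elapsed wall-clock seconds.
def time_requests(count)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  count.times { |i| yield i }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Stand-in workload; replace the block body with a real request, such as
# client.get(url), for the library under test.
elapsed = time_requests(10) { |i| i.to_s * 100 }
puts format('%.6f sec', elapsed)
```

Using the monotonic clock keeps the result unaffected by system clock adjustments during the run.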

Multiple 177B downloads

[Benchmark charts; times in seconds]

My Recommendations

• Speed is king => em-http-request, curb w/ multi

• HTML operation, Cookies => Mechanize

• API client => Faraday and adapter based impls

• SSL, Connectivity => httpclient

Check the matrix before you use the libraries

Please let me know if you find an incorrect cell

Development Timeline
