gevent network library Denis Bilenko gevent.org

Upload: henry-prentiss

Post on 02-Apr-2015

gevent network library

Denis Bilenko

gevent.org


Problem statement

    from urllib2 import urlopen
    response = urlopen('http://gevent.org')
    body = response.read()

How to manage concurrent connections?


Problem statement

    def on_response_read(response):
        d = response.read()
        d.addCallbacks(on_body_read, on_error)

    def on_error(error):
        ...

    def on_body_read(body):
        ...

    d = readURL('http://gevent.org')
    d.addCallbacks(on_response_read, on_error)
    reactor.run()

Possible answer: Async framework (Twisted, asyncore, ...)


simplicity is lost


Problem statement

    import urllib2
    from threading import Thread

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    t1 = Thread(target=read_url, args=('http://gevent.org',))
    t1.start()
    t2 = Thread(target=read_url, args=('http://python.org',))
    t2.start()
    t1.join()
    t2.join()

Possible answer: Threads


resource hog


Memory required for 10k connections

    twisted      55 MB
    threading   400 MB


gevent (greenlet + libevent)

    from gevent import monkey; monkey.patch_all()

    import gevent
    import urllib2

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    a = gevent.spawn(read_url, 'http://gevent.org')
    b = gevent.spawn(read_url, 'http://python.org')
    gevent.joinall([a, b])


concurrent fetch


Memory required for 10k connections

    twisted      55 MB
    gevent       70 MB
    threading   400 MB


greenlet


from greenlet import greenlet

    >>> def myfunction(arg):
    ...     return arg + 1
    ...
    >>> g = greenlet(myfunction)
    >>> g.switch(2)
    3


from greenlet import greenlet

    >>> MAIN = greenlet.getcurrent()
    >>> def myfunction(arg):
    ...     MAIN.switch('hello')
    ...     return arg + 1
    ...
    >>> g = greenlet(myfunction)
    >>> g.switch(2)
    'hello'
    >>> g.switch('hello to you')
    3


switching deep down the stack

    >>> def myfunction(arg):
    ...     MAIN.switch('hello')
    ...     return arg + 1
    ...
    >>> def top_function(arg):
    ...     return myfunction(arg)
    ...
    >>> g = greenlet(top_function)
    >>> g.switch(2)
    'hello'
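For contrast, here is a rough sketch of the same exchange written with Python's built-in generators instead of greenlet (Python 3 syntax; this is an analogy, not how greenlet or gevent is implemented). A generator can hand a value out and receive one back much like switch() does, but a nested call cannot suspend the outer frame on its own: every frame in between has to delegate with `yield from`. The names mirror the slide; `run_demo` is a hypothetical helper.

```python
def myfunction(arg):
    # 'yield' plays the role of MAIN.switch('hello'): it hands 'hello' to the
    # caller and later receives whatever the caller sends back.
    reply = yield 'hello'
    return arg + 1, reply


def top_function(arg):
    # Unlike a greenlet, a generator cannot be suspended from inside a plain
    # nested call; the outer frame has to delegate explicitly.
    result = yield from myfunction(arg)
    return result


def run_demo():
    g = top_function(2)
    first = next(g)                # runs until the yield
    final = None
    try:
        g.send('hello to you')     # resumes; the generator then finishes
    except StopIteration as stop:
        final = stop.value         # return value of top_function
    return first, final
```

With greenlet, top_function stays an ordinary function, which is why unmodified code further up the stack keeps working.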


from greenlet import greenlet

• primitive pseudothreads, share same OS thread
• switched explicitly via switch() and throw()
• organized in a tree, each has .parent except MAIN
• switch(), throw() and .parent are reserved for gevent

http://codespeak.net/py/0.9.2/greenlet.html


How gevent uses greenlet

[Diagram: HUB, MAIN, and spawned greenlets]


Hub: greenlet that runs event loop

    import greenlet
    from gevent import core

    class Hub(greenlet.greenlet):

        def run(self):
            core.dispatch()  # wrapper for event_dispatch()

    def get_hub():
        # return the global Hub instance,
        # creating one if it does not exist
        ...

gevent/hub.py


Event loop

• libevent 1.4.x or 2.0.5-beta
• gevent.core: wraps libevent API (like pyevent)

    >>> def print_hello():
    ...     print 'hello'
    ...
    >>> gevent.core.timer(1, print_hello)
    <timer ...>
    >>> gevent.core.dispatch()
    hello
    1 # return value (no more events)
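The register-then-dispatch pattern above can be tried without libevent. The following is a small sketch using the standard library's `sched` module as a stand-in for gevent.core (an assumption made for illustration; gevent does not use `sched`): callbacks are registered with a delay, and a single blocking call runs them in time order until nothing is left.

```python
import sched
import time

calls = []

# time.monotonic supplies the clock, time.sleep does the waiting.
loop = sched.scheduler(time.monotonic, time.sleep)

# Register two timers out of order; the loop fires them by deadline.
loop.enter(0.02, 1, calls.append, argument=('second',))
loop.enter(0.01, 1, calls.append, argument=('first',))

loop.run()  # blocks until the queue is empty, like a dispatch() call
```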


Implementation of gevent.sleep()

    def sleep(seconds=0):
        """Put the current greenlet to sleep"""
        switch = getcurrent().switch
        timer = core.timer(seconds, switch)
        try:
            get_hub().switch()
        finally:
            timer.cancel()
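The idea behind sleep() is: schedule a wake-up, give control to the loop, and continue when the loop switches back. A toy version of that idea can be sketched with generators standing in for greenlets (Python 3, standard library only; `run` and `worker` are hypothetical names, and this is not gevent's code). Here `yield pause` plays the role of gevent.sleep(pause).

```python
import heapq
import itertools
import time


def run(tasks):
    """Toy hub: resume each generator once its requested pause has elapsed."""
    counter = itertools.count()   # tie-breaker so the heap never compares tasks
    ready = [(time.monotonic(), next(counter), task) for task in tasks]
    heapq.heapify(ready)
    while ready:
        wake_at, _, task = heapq.heappop(ready)
        delay = wake_at - time.monotonic()
        if delay > 0:
            time.sleep(delay)     # the only place the whole program blocks
        try:
            pause = next(task)    # run the task until it "sleeps" again
        except StopIteration:
            continue              # task finished, nothing to reschedule
        heapq.heappush(ready, (time.monotonic() + pause, next(counter), task))


log = []


def worker(name, pause, steps):
    for i in range(steps):
        log.append((name, i))
        yield pause               # analogous to gevent.sleep(pause)


run([worker('a', 0.05, 2), worker('b', 0.01, 2)])
```

While worker 'a' is waiting, worker 'b' gets to run both of its steps, which is the interleaving a cooperative sleep is meant to allow.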


Cooperative socket

• gevent.socket: compatible synchronous interface
• wraps a non-blocking socket

    def recv(self, size):
        while True:
            try:
                return self._sock.recv(size)
            except error, ex:
                if ex[0] == EWOULDBLOCK:
                    wait_read(self.fileno())
                else:
                    raise


Cooperative socket

• gevent.socket: compatible synchronous interface
• wraps a non-blocking socket

    def wait_read(fileno):
        switch = getcurrent().switch
        event = core.read_event(fileno, switch)
        try:
            get_hub().switch()
        finally:
            event.cancel()

gevent/socket.py
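The retry loop from the two slides above can be exercised with plain sockets. This is a sketch in Python 3 using only the standard library, where EWOULDBLOCK surfaces as BlockingIOError. The `wait_read` here is a stand-in that blocks in select(); gevent's version would instead register a read event and switch to the hub so other greenlets can run. `cooperative_recv` is a hypothetical name.

```python
import select
import socket
import threading


def wait_read(sock):
    # Stand-in: block the whole thread until the socket is readable.
    select.select([sock], [], [])


def cooperative_recv(sock, size):
    while True:
        try:
            return sock.recv(size)
        except BlockingIOError:       # nothing to read yet (EWOULDBLOCK/EAGAIN)
            wait_read(sock)


left, right = socket.socketpair()
left.setblocking(False)               # recv() now raises instead of blocking

# Send a little later so the first recv() attempt finds no data.
threading.Timer(0.05, right.sendall, args=(b'ping',)).start()

data = cooperative_recv(left, 4)
left.close()
right.close()
```

The caller still sees a synchronous recv(); only the waiting strategy underneath changes.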


Cooperative socket

• gevent.socket
• dns queries are resolved through libevent-dns (getaddrinfo, gethostbyname)
• gevent.ssl


Monkey patching

    from gevent import monkey; monkey.patch_all()

    import gevent
    import urllib2

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    a = gevent.spawn(read_url, 'http://gevent.org')
    b = gevent.spawn(read_url, 'http://python.org')
    gevent.joinall([a, b])


Monkey patching

Patches:
• socket and ssl modules
• time.sleep, select.select
• thread and threading

Beware:
• libraries that wrap C libraries (e.g. MySQLdb)
• Disk I/O
• things not yet patched: subprocess, os.system, sys.stdin

Tested with httplib, urllib2, mechanize, mysql-connector, SQLAlchemy, ...
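Mechanically, monkey patching is ordinary attribute replacement on module objects. The sketch below (Python 3, standard library only) shows the principle with a hypothetical `fake_sleep` rather than gevent's cooperative version: code that looks up `time.sleep` at call time picks up the replacement without being modified.

```python
import time


def unsuspecting_library_call():
    # Looks up time.sleep at call time, so it sees whatever is installed.
    time.sleep(0.5)
    return 'done'


recorded = []


def fake_sleep(seconds):
    # Hypothetical replacement: record the request instead of blocking.
    recorded.append(seconds)


original_sleep = time.sleep
time.sleep = fake_sleep               # the "patch"
try:
    result = unsuspecting_library_call()
finally:
    time.sleep = original_sleep       # always restore the real function
```

Code that ran `from time import sleep` before the patch keeps its reference to the original function, which is one reason the slides call monkey.patch_all() on the very first line.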


Greenlet objects

    from gevent import monkey; monkey.patch_all()

    import gevent
    import urllib2

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    a = gevent.spawn(read_url, 'http://gevent.org')
    b = gevent.spawn(read_url, 'http://python.org')
    gevent.joinall([a, b])


Greenlet objects

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    g = Greenlet(read_url, url)   # these two lines
    g.start()                     # = spawn

    # wait for it to complete
    g.join()

    # or raise an exception and wait to exit
    g.kill()


Greenlet objects

    def read_url(url):
        response = urllib2.urlopen(url)
        body = response.read()

    g = Greenlet(read_url, url)   # these two lines
    g.start()                     # = spawn

    # wait for it to complete (or timeout expires)
    g.join(timeout=2)

    # or raise and wait to exit (or timeout expires)
    g.kill(timeout=2)


Timeouts

    with gevent.Timeout(5):
        response = urllib2.urlopen(url)
        for line in response:
            print line
    # raises Timeout if not done after 5 seconds

    with gevent.Timeout(5, False):
        response = urllib2.urlopen(url)
        for line in response:
            print line
    # exits block if not done after 5 seconds

Beware: catch-all “except:”, non-yielding code
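The catch-all warning can be illustrated without gevent. In current gevent releases Timeout derives from BaseException, so `except Exception` lets it pass while a bare `except:` swallows it. The sketch below uses a hypothetical `FakeTimeout` class with the same base class (Python 3); it does not involve a real timer.

```python
class FakeTimeout(BaseException):
    """Hypothetical stand-in for a timeout that derives from BaseException."""


def guarded_with_exception():
    try:
        raise FakeTimeout()
    except Exception:
        return 'swallowed'      # not reached: FakeTimeout is not an Exception


def guarded_with_bare_except():
    try:
        raise FakeTimeout()
    except:                     # catch-all: also traps the timeout
        return 'swallowed'


try:
    outcome = guarded_with_exception()
except FakeTimeout:
    outcome = 'propagated'

bare_outcome = guarded_with_bare_except()
```

The other caveat, non-yielding code, has a different cause: a timeout can only fire when the running greenlet gives control back to the event loop, so a CPU-bound loop that never yields is not interrupted.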


API

• socket, ssl
• Greenlet
• Timeout
• Event, AsyncResult
• Queue (also JoinableQueue, PriorityQueue, LifoQueue)
  – Queue(0) is a synchronous channel
• Pool
• StreamServer: TCP and SSL servers
• WSGI servers


WSGI servers

• gevent.wsgi
  – uses libevent-http
  – efficient, but lacks important features
• gevent.pywsgi
  – uses gevent sockets
• green unicorn (gunicorn.org)
  – its own parser or gevent’s server
  – pre-fork workers


Caveat emptor

• Reduced portability
  – no Jython, IronPython
  – not all platforms supported by CPython
• PyThreadState is shared
  – exc_info (saved/restored by gevent)
  – tracing, profiling info


Future plans

• http://code.google.com/p/gevent/issues/list
• alternative coroutine libraries
  – Stackless
  – swapcontext
• more libevent:
  – http client
  – buffered socket operations
  – priorities
• process handling (gevent.subprocess)
• even more stable API with 1.0


Examples

• bitbucket.org/denis/gevent/src/tip/examples/
• chat.gevent.org
• omegle.com
• ProjectsUsingGevent
  – gevent-mysql
  – psycopg2
• bit.ly/use-gevent
  – websockets, web crawlers, facebook apps


Summary

• coroutines are easy-to-use threads
• as efficient as async libraries
• works well if app is I/O bound
• simple API, many things familiar
• works with unsuspecting 3rd party modules


Thank you!

gevent.org
@gevent