gearman: a job server made for scale

Post on 17-May-2015

A Job Server to Scale

By Mike Willbanks

Software Engineering Manager

CaringBridge

MinneBar April 7, 2012

•Talk

Slides will be online later!

•Me

Software Engineering Manager at CaringBridge

MNPHP Organizer

Open Source Contributor (Zend Framework and various others)

Where you can find me:

• Twitter: mwillbanks

• G+: Mike Willbanks

• IRC (freenode): mwillbanks

• Blog: http://blog.digitalstruct.com

• GitHub: https://github.com/mwillbanks

Housekeeping…

• What is Gearman

Yeah yeah…

• Main Concepts

How it really works

• Quick Start

Get it up and running and start playing.

• The Details

How can it be a tech talk without details?

• Some use cases

How you might use it.

• Questions

Although you can bring them up at any time!

Agenda

What is Gearman?

Official Statement

What the hell it means

Visual understanding

Platforms

“Gearman provides a generic application framework to farm out work to other machines or processes that are better suited to do the work. It allows you to do work in parallel, to load balance processing, and to call functions between languages.”

Official Statement

•Gearman consists of a daemon, client and worker

At the core, they are simply small programs.

•The daemon handles the negotiation of work between workers and clients

•The worker does the work

•The client requests work to be done

What The Hell? Tell me!
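The three roles can be sketched without any Gearman install. This is only a toy, in-memory stand-in, not the real protocol: `ToyDaemon`, `register`, `submit` and `run_pending` are names made up for illustration, and the `wc` function simply mirrors the line-count example used later in the talk.

```python
from collections import defaultdict, deque


class ToyDaemon:
    """In-memory stand-in for gearmand: queues jobs and hands them to workers."""

    def __init__(self):
        self.queues = defaultdict(deque)   # function name -> pending workloads
        self.workers = {}                  # function name -> registered callable

    def register(self, function_name, handler):
        # A worker announces which function it can perform.
        self.workers[function_name] = handler

    def submit(self, function_name, workload):
        # A client asks for work to be done; the daemon only queues it.
        self.queues[function_name].append(workload)

    def run_pending(self):
        # Hand each queued workload to the worker registered for that function.
        results = []
        for function_name, queue in self.queues.items():
            handler = self.workers.get(function_name)
            while handler is not None and queue:
                results.append(handler(queue.popleft()))
        return results


def count_lines(workload):
    # The worker does the work: the same job as `wc -l`.
    return str(len(workload.splitlines()))


daemon = ToyDaemon()
daemon.register("wc", count_lines)            # worker side
daemon.submit("wc", "root\ndaemon\nbin\n")    # client side
print(daemon.run_pending())
```

Note that the client never calls `count_lines` directly; it only names a function and hands over a string, which is what lets clients and workers live on different machines and in different languages.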

In Pictures

•Gearman works on Linux

•API implementations available

PHP

Perl

Java

Ruby

Python

Platforms

Main Concepts

Client -> Daemon -> Worker communication

Distributed Model

Client -> Daemon -> Worker communication

Distributed Model

Quick Start

Installation

Simple Bash Example

PHP Related (sorry, I’m all about the PHP)

•Head to gearman.org

•Click Download

•Click on the LaunchPad download

•Download the source tarball

•Unpack the tarball

• ./configure && make && make install

•Bam! You’re off!

For more advanced configuration see ./configure --help

•Starting

gearmand -d

Installation

•Starting the Daemon

gearmand -d

•Worker – command line style

gearman -w -f wc -- wc -l

•Client – command line style

gearman -f wc < /etc/passwd

•Check it!

Simple Bash Example

PHP Style

•So, you know… we all like to talk about ourselves…

Yes, I wrote a layer on top of Zend Framework called

Zend_Gearman; wow unique.

https://github.com/mwillbanks/Zend_Gearman

PHP – Zend Framework

The Details

Persistence

Workers

Monitoring

•Gearman by default is an in-memory queue

Leaving this as the default is ideal; however, it does not work in all environments.

•Persistent Queues

Libdrizzle

Libsqlite3

Libmemcached

Postgres

TokyoCabinet

MySQL

Redis

Persistence

•Persistent queues require specific configuration during the compilation of gearman.

•Additionally, arguments to the gearman daemon need to be passed to talk to the specific persistence layer.

•Each persistence layer is actually built as a plugin to gearmand

http://bazaar.launchpad.net/~tangent-org/gearmand/trunk/files/head:/libgearman-server/plugins/queue/

Getting Up and Running with Persistence

Configuration Options

•Clients send work to the gearmand server

This is called the workload; it can be anything that can become a string.

Utilize an open format; it will make life easier if you choose to use a different language for processing

• XML, JSON, etc.

• Yes, you can serialize objects if you want to… although it is not recommended.

Clients
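As a rough illustration of the "open format" advice, a workload can be a small JSON document that any worker language can read back. The helper names and the `image_id` / `sizes` fields below are invented for this sketch; they are not part of any Gearman API.

```python
import json


def encode_workload(payload):
    # Serialize to a JSON string, then to bytes for the wire.
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def decode_workload(raw):
    # What a worker, in Python or any other language with a JSON parser, would do.
    return json.loads(raw.decode("utf-8"))


workload = encode_workload({"image_id": 42, "sizes": [100, 300]})
print(workload)
print(decode_workload(workload))
```

Compared with a language-specific object serializer, the JSON form keeps the client and the worker free to be rewritten independently.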

•Workers are the dudes in the factory doing all the work

•Generally they will run as a daemon in the background

•Workers register a function that they perform

They should ONLY be doing a single task.

This makes them far easier to manage.

•The worker does the work and “can” return results

If you are doing the work asynchronously, you generally do not return the result.

For synchronous work, you return the result.

Workers

•Utilizing the Database

If you keep a database connection open:

• Must have the ability to reconnect to the database.

• Watch for connection timeouts

•Handling Memory Leaks

Watch the amount of memory in use; if you detect a leak, kill the worker and let it restart.

•Request Languages

PHP, for instance, sometimes slows down after hundreds of executions; kill the worker off if you know this will happen.

Workers – special notes
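One way to act on the memory and "slows down after hundreds of executions" notes is to have the worker stop after a job count or memory ceiling and rely on a process supervisor to start a fresh one. The sketch below is an assumption-laden outline: `next_job` and `handle_job` are hypothetical callables standing in for whatever Gearman binding is used, the limits are arbitrary, and `ru_maxrss` reports peak rather than current memory on Unix systems.

```python
import resource
import sys

MAX_JOBS = 500             # assumed limit; tune for your workload
MAX_RSS_KB = 256 * 1024    # assumed memory ceiling (256 MB)


def peak_rss_kb():
    # ru_maxrss is kilobytes on Linux and bytes on macOS.
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return rss // 1024 if sys.platform == "darwin" else rss


def run_worker(next_job, handle_job, max_jobs=MAX_JOBS, max_rss_kb=MAX_RSS_KB):
    """Process jobs until a limit is hit, then return so the process can exit."""
    done = 0
    while done < max_jobs and peak_rss_kb() < max_rss_kb:
        job = next_job()
        if job is None:    # hypothetical "no more work" marker
            break
        handle_job(job)
        done += 1
    return done
```

The caller would exit the process once `run_worker` returns, which gives a clean slate for memory and database connections on the next start.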

•Workers sometimes have issues and die, or you need to boot

them back up after a restart

Utilizing a service to watch your workers and ensure they are

always running is a GOOD thing.

•Supervisord

Can watch processes, restart them if they die or get killed

Can manage multiple processes of the same program

Can start and stop your workers.

•When running workers, BE SURE to handle termination signals such as SIGTERM (SIGKILL cannot be caught).

Keeping the Daemon Running
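A minimal sketch of signal handling in a worker, assuming the supervisor stops workers with SIGTERM (supervisord's default stop signal). The handler only sets a flag so the job in progress can finish; `next_job` and `handle_job` are again hypothetical stand-ins for a real Gearman binding.

```python
import signal
import threading

stop = threading.Event()


def request_shutdown(signum, frame):
    # Only set a flag; the loop below finishes the current job first.
    stop.set()


def install_handlers():
    # SIGTERM is sent by supervisord by default; SIGINT covers Ctrl-C.
    # SIGKILL cannot be caught, so there is nothing to register for it.
    signal.signal(signal.SIGTERM, request_shutdown)
    signal.signal(signal.SIGINT, request_shutdown)


def work_loop(next_job, handle_job):
    handled = 0
    while not stop.is_set():
        job = next_job()
        if job is None:    # hypothetical "no more work" marker
            break
        handle_job(job)
        handled += 1
    return handled
```

A worker script would call `install_handlers()` once at startup and then run `work_loop(...)`; a real worker that blocks while waiting for jobs would also need a timeout on that wait so the flag gets checked.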

Supervisord Example

•Until recently you were writing something against the

gearman socket interface…

telnet to port 4730

Send “STATUS”

• Returns the registered functions, the number of workers, and the number of items in the queue.

•Gearman Monitor – PHP Project

NOTE: I’ve never actually attempted this; BUT it is referenced on

gearman.org so it must be doing something!

https://github.com/yugene/Gearman-Monitor

Monitoring
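The telnet check can be scripted. The sketch below assumes the reply format of gearmand's text admin command: one tab-separated line per function (name, jobs in queue, jobs running, available workers) followed by a line containing a single dot. The function names are made up for this example, and the command is sent in lowercase, which is how the admin protocol is usually written; confirm both against your gearmand version.

```python
import socket


def parse_status(text):
    """Parse a status reply into {function: (queued, running, workers)}."""
    functions = {}
    for line in text.splitlines():
        if line == ".":        # a lone dot ends the reply
            break
        if not line.strip():
            continue
        name, queued, running, workers = line.split("\t")
        functions[name] = (int(queued), int(running), int(workers))
    return functions


def fetch_status(host="127.0.0.1", port=4730, timeout=2.0):
    # Send the text command and read until the terminating ".\n" line.
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"status\n")
        data = b""
        while not (data == b".\n" or data.endswith(b"\n.\n")):
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    return parse_status(data.decode("utf-8"))
```

With a daemon running locally, `fetch_status()` returns a dictionary that can be fed to whatever alerting is already in place, for example to warn when a queue grows while its worker count is zero.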

Use Cases

Email

Photos

Log Analysis / Aggregation

• If you resize images on your web server:

Web servers should serve, not process images.

Images require a lot of memory AND processing power

• They are best processed on their own!

•Processing in the Background

Generally will require a change to your workflow and checking the

status with XHR to see if the job has been completed.

• This allows you to process them as you have resources available.

• Have enough workers to process them “quickly enough”

Images
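The changed workflow (submit, return immediately, poll for completion) can be outlined as follows. This is not Gearman code: a thread pool stands in for the worker machines, and `submit_resize`, `job_status` and `resize_image` are names invented for the sketch. With Gearman, the handle would be the job handle returned for a background job, and the status lookup would go through the daemon or a shared store.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=2)   # stands in for the worker machines
jobs = {}                                  # job handle -> Future


def resize_image(path):
    # Placeholder for the real, expensive resize.
    return path + ".thumb"


def submit_resize(path):
    # Called by the upload request: queue the work and return at once.
    handle = uuid.uuid4().hex
    jobs[handle] = pool.submit(resize_image, path)
    return handle


def job_status(handle):
    # Called by the endpoint that the browser polls with XHR.
    future = jobs.get(handle)
    if future is None:
        return {"state": "unknown"}
    if not future.done():
        return {"state": "pending"}
    return {"state": "done", "result": future.result()}
```

The upload handler would return the handle to the page, and the page would poll `job_status` until the state becomes `done`.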

Image Processing Example

•Sending email and/or generating templates and processing

variables can take up time, time that is better spent getting

the user to the next page.

•Feedback from sending the mail rarely makes a difference to the user, so it is a great candidate for the background.

Email

Email Example

•Get all of your logs to a single place

•Process the logs to produce analytical data

• Impression / Click Tracking

•Why run a cron over your logs nightly?

Real-time data is where it is at!

Log Analysis / Aggregation

Log Analysis / Aggregation

Questions?

These slides will be posted to SlideShare & SpeakerDeck.

Slideshare: http://www.slideshare.net/mwillbanks

SpeakerDeck: http://speakerdeck.com/u/mwillbanks

Twitter: mwillbanks

G+: Mike Willbanks

IRC (freenode): mwillbanks

Blog: http://blog.digitalstruct.com

GitHub: https://github.com/mwillbanks
