Distributed Applications with Perl & Gearman
DESCRIPTION
Gearman is a framework for writing distributed applications across many nodes. It allows you to do work in parallel, load balance processes and write applications across several programming languages. In this presentation we'll learn how to get started writing Gearman-powered applications in Perl.
TRANSCRIPT
Distributed Applications With Perl & Gearman
Issac [email protected]
ABOUT THE PRESENTER
http://linkedin.com/in/margol
60 seconds before we get started…
Why Distributed?
Allow horizontal scaling of compute nodes
  Normal resources (CPU, RAM, Disk)
  Other resources (specialized HW, SW)
True asynchronous worker processing
Cross-Language support
Redundant application availability
Gearman Architecture
[Diagram: Clients, Server (Cluster) of three servers, Worker Pool of workers]
Example 1 – Video Processing
User uploads a video
Server processes the video
Server must transcode the video to several pre-set resolutions/codecs
Server must extract sample still images
Server must run speech-to-text recognition to extract subtitles (closed captions)
Once all of that is completed, server must update the video metadata to contain all of the newly available data and metadata
Example 2 – Map/Reduce Style Big Data
User searches for information
Server must search catalog 1
Server must search catalog 2
…
Server must search catalog n
Server must return combined search results to user
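The fan-out in Example 2 could be sketched roughly as follows, using the Gearman::Client task-set calls that appear later in this deck. The function names (search_catalog1 and so on), the catalog list, and the newline-separated response format are assumptions made for illustration only; they are not from the talk.

```perl
use strict;
use warnings;

# Hypothetical catalog names; one worker function per catalog is assumed.
my @catalogs = qw(catalog1 catalog2 catalog3);

# Reduce step: merge per-catalog responses into one de-duplicated, sorted list.
# Each response is assumed to be a newline-separated list of hits.
sub combine_results {
    my @responses = @_;
    my %seen;
    my @hits = grep { length($_) && !$seen{$_}++ }
               map  { split /\n/, $_ } @responses;
    my @sorted = sort @hits;
    return @sorted;
}

# Map step: submit one task per catalog, then wait for all of them.
sub search_all {
    my ($query) = @_;
    require Gearman::Client;    # loaded lazily; needs a running gearmand
    my $client = Gearman::Client->new;
    $client->job_servers('127.0.0.1:4730');

    my @responses;
    my $tasks = $client->new_task_set;
    for my $catalog (@catalogs) {
        $tasks->add_task("search_$catalog" => $query, {
            # on_complete receives a reference to the worker's response
            on_complete => sub { push @responses, ${ $_[0] } },
        });
    }
    $tasks->wait;
    return combine_results(@responses);
}
```

A caller would run something like `my @hits = search_all('gearman');` once workers for each hypothetical function are registered.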
Example 3 – AntiVirus Scanner
User uploads a file
Server must scan with McAfee
Server must scan with Norton 360
Server must scan with Sophos
…
Server must scan with Engine XYZ
Server returns scan results to user
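For Example 3, a job that fails on one engine should not hide the results from the others. A minimal sketch, assuming hypothetical function names (scan_mcafee, scan_norton, scan_sophos) and an assumed convention that each worker returns either "clean" or "infected":

```perl
use strict;
use warnings;

# Hypothetical function names, one per scanning engine.
my @engines = qw(scan_mcafee scan_norton scan_sophos);

# Collapse per-engine results into one verdict. Assumed convention:
# a worker returns "clean" or "infected"; a failed job is recorded as "error".
sub overall_verdict {
    my (%result) = @_;
    my @values = values %result;
    return 'inconclusive' unless @values;
    return 'infected'     if grep { $_ eq 'infected' } @values;
    return 'inconclusive' if grep { $_ ne 'clean' } @values;
    return 'clean';
}

sub scan_file {
    my ($file_contents) = @_;
    require Gearman::Client;    # loaded lazily; needs a running gearmand
    my $client = Gearman::Client->new;
    $client->job_servers('127.0.0.1:4730');

    my %result;
    my $tasks = $client->new_task_set;
    for my $engine (@engines) {
        $tasks->add_task($engine => $file_contents, {
            on_complete => sub { $result{$engine} = ${ $_[0] } },
            on_fail     => sub { $result{$engine} = 'error' },
        });
    }
    $tasks->wait;
    return overall_verdict(%result);
}
```

Treating any engine error as "inconclusive" rather than "clean" is a design choice of this sketch; a real deployment would pick its own policy.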
Gearman Performance Stats
Collected in October 2013 (https://groups.google.com/forum/#!topic/gearman/ror1rd6EGX0)
DealNews – 40 foreground tasks/sec (2 datacenters, 4 servers, 350 workers)
Etsy – 1000 tasks/sec (2 datacenters, 4 servers, 40 workers)
Shazam – 10K tasks/sec
From my own experience – 30 tasks/sec (1 datacenter, 1 server, 240 workers)
Gearman in Perl
Gearman / Gearman::Server (PP)
Gearman::XS (“official” libgearman)
AnyEvent::Gearman and AnyEvent::Gearman::Client (not the same)
They aren’t 100% up-to-date
They aren’t 100% feature-compatible
The first two are both (individually) good for 90% of use cases (in my personal experience)
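Because these modules are not interchangeable, a script may want to check which one is installed before choosing a code path. The helper below is a hypothetical sketch (the name first_available is not from any Gearman module); it only probes for loadable modules and does not paper over the API differences between them.

```perl
use strict;
use warnings;

# Return the first module from the list that can be loaded, or undef.
sub first_available {
    my @candidates = @_;
    for my $module (@candidates) {
        # String eval so that "require" treats the value as a module name.
        return $module if eval "require $module; 1";
    }
    return;
}

my $backend = first_available('Gearman::XS::Client', 'Gearman::Client');
print defined $backend
    ? "Using $backend\n"
    : "No Gearman client module installed\n";
```

The two client classes expose different methods (compare the simple and the more sophisticated clients in this deck), so the caller still needs separate code for each.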
Gearman in other languages
C/C++ – official libgearman + CLI
PHP – GearmanManager (well-maintained framework), PHP_Gearman (PECL), Net_Gearman (pure PHP)
Node.JS
.NET/C#
Java
Python
UDFs – MySQL, PostgreSQL, Drizzle
Write your own to implement the Gearman binary protocol
Creating a worker
    use strict;
    use warnings;
    use Gearman::Worker;

    my $worker = Gearman::Worker->new;
    $worker->job_servers('127.0.0.1:4730');
    $worker->register_function('say_hello', \&hello);
    $worker->work;    # will never return

    sub hello {
        my $arg = $_[0]->arg;
        return "Hello, $arg\n";
    }
Testing the worker
    [issac@localhost ~]$ gearman -f say_hello 'world'
    Hello, world
    [issac@localhost ~]$
Writing a client
    use strict;
    use warnings;
    use Gearman::Client;

    my $client = Gearman::Client->new;
    $client->job_servers('127.0.0.1:4730');
    # do_task returns a reference to the response
    print ${$client->do_task(say_hello => 'world')};
Writing an asynchronous client
    use strict;
    use warnings;
    use Gearman::Client;

    my $client = Gearman::Client->new;
    $client->job_servers('127.0.0.1:4730');
    my $tasks = $client->new_task_set;
    $tasks->add_task(say_hello => 'world', { on_complete => \&done });
    $tasks->wait;

    sub done {
        print ${$_[0]};
    }
Worker response (packet) types
SUCCESS
FAIL
STATUS
DATA
EXCEPTION*
WARNING*
A more sophisticated worker 1/3
    use strict;
    use warnings;
    use Gearman::XS qw(:constants);
    use Gearman::XS::Worker;

    my $worker = Gearman::XS::Worker->new;
    my $ret = $worker->add_server('127.0.0.1');
    if ($ret != GEARMAN_SUCCESS) {
        printf(STDERR "%s\n", $worker->error());
        exit(1);
    }
    $worker->add_options(GEARMAN_WORKER_NON_BLOCKING);
    $worker->set_timeout(500);
A more sophisticated worker 2/3
    $ret = $worker->add_function("hello", 3, \&hello, {});
    $ret = $worker->add_function("uhoh", 3, \&uhoh, {});
    if ($ret != GEARMAN_SUCCESS) {
        printf(STDERR "%s\n", $worker->error());
    }
    my $go = 1;
    $SIG{TERM} = sub { print "Caught SIGTERM\n"; $go = 0; };
    while ($go) {
        # will return after 500ms since we set timeout + non-blocking mode above
        my $ret = $worker->work();
        if (!($ret == GEARMAN_SUCCESS || $ret == GEARMAN_IO_WAIT || $ret == GEARMAN_NO_JOBS)) {
            printf(STDERR "%s\n", $worker->error());
        }
        $worker->wait();
    }
A more sophisticated worker 3/3
    sub hello {
        my $job = shift;
        $job->send_status(1, 2);
        my $string = "Hello, " . $job->workload() . "\n";
        $job->send_status(2, 2);
        return $string;
    }

    sub uhoh {
        my $job = shift;
        $job->send_warning("uh oh");
        $job->send_data($job->workload() . "\n");
        $job->send_fail();
    }
Testing the (slightly) sophisticated worker
    [issac@localhost ~]$ gearman -f hello 'world'
    50% Complete
    100% Complete
    Hello, world
    [issac@localhost ~]$ gearman -f uhoh 'world'
    uh oh
    world
    Job failed
    [issac@localhost ~]$
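The percentages in the output above line up with the send_status(1,2) and send_status(2,2) calls in the worker. One way to derive such a figure from a numerator/denominator pair is sketched below; percent_complete is a hypothetical helper, not part of Gearman::XS or the gearman CLI.

```perl
use strict;
use warnings;

# Convert a STATUS numerator/denominator pair into a whole percentage.
sub percent_complete {
    my ($numerator, $denominator) = @_;
    return 0 unless $denominator;    # avoid division by zero
    return int(100 * $numerator / $denominator);
}

printf "%d%% Complete\n", percent_complete(1, 2);    # prints "50% Complete"
printf "%d%% Complete\n", percent_complete(2, 2);    # prints "100% Complete"
```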
A more sophisticated client 1/3
    use strict;
    use warnings;
    use Gearman::XS qw(:constants);
    use Gearman::XS::Client;

    my $client = Gearman::XS::Client->new;

    my $task;
    my $ret = $client->add_server('127.0.0.1');
    if ($ret != GEARMAN_SUCCESS) {
        printf(STDERR "%s\n", $client->error());
        exit(1);
    }
A more sophisticated client 2/3
    $client->set_complete_fn(sub {
        my $task = shift;
        print "COMPLETE: " . $task->data() . "\n";
        return GEARMAN_SUCCESS;
    });

    $client->set_data_fn(sub {
        print "DATA: " . $_[0]->data() . "\n";
        return GEARMAN_SUCCESS;
    });

    $client->set_warning_fn(sub {
        print "WARNING: " . $_[0]->data() . "\n";
        return GEARMAN_SUCCESS;
    });
A more sophisticated client 3/3
    $client->set_fail_fn(sub {
        print "FAIL: " . $_[0]->function_name() . "\n";
        return GEARMAN_SUCCESS;
    });

    $client->set_status_fn(sub {
        print "STATUS: " . $_[0]->numerator() . "/" . $_[0]->denominator() . "\n";
        return GEARMAN_SUCCESS;
    });

    ($ret, $task) = $client->add_task("hello", "world");
    ($ret, $task) = $client->add_task("uhoh", "it broke");
    $ret = $client->run_tasks();
A more sophisticated client (output)
    [issac@localhost ~]$ perl asclient.pl
    STATUS: 1/2
    STATUS: 2/2
    COMPLETE: Hello, world

    WARNING: uh oh
    DATA: it broke

    FAIL: uhoh
    [issac@localhost ~]$
That’s All, Folks!
Issac [email protected]
http://www.itsgoodconsulting.com/blog/issac-presenting-distributed-apps-with-gearman-at-telaviv-pm/
Link To Slides