Gearman: From the Worker's Perspective
/usr/bin/whoami
Brian Aker
HP Fellow
Previously MySQL, Slashdot, Sun Microsystems
What is Gearman?
“The way I like to think of Gearman is a massively
distributed fork mechanism” -Joe Stump, Digg
“The Not Mechanical Turk” -Don MacAskill, SmugMug
do("resize_image")
resize_image() { … return $image; }
(Livejournal)
Server
Provides Asynchronous and Synchronous Requests
Restarts Work
Durable Requests (MySQL, Postgres, ...)
Gearman Protocol/HTTP
Epoch Scheduling
Logging
(also available: native servers in Java, Erlang, and Perl)
Client
# Create our client object.
$gmclient= new GearmanClient();

# Add default server (localhost).
$gmclient->addServer();

$result= $gmclient->do("reverse", "Hello!");
echo "Success: $result\n";
Worker
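# Create our worker object.
$gmw= new GearmanWorker();

# Add default server (localhost).
$gmw->addServer();

# Register the callback that handles "reverse" jobs.
$gmw->addFunction("reverse", "reverse_fn");

# Block waiting for jobs and run them as they arrive.
while ($gmw->work()) {…}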
Worker Function
function reverse_fn($job)
{
  $workload= $job->workload();
  $result= strrev($workload);
  return $result;
}
Lots of functions...
$gmw->addFunction("resize", "resize_fn");
$gmw->addFunction("grep", "grep_fn");
$gmw->addFunction("fetch_url", "fetch_url");
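The idea behind addFunction() is a table that maps a function name to a callback. A minimal in-process sketch of that table follows. ToyWorker, add_function and run are hypothetical names invented for illustration; this is not the Gearman API and involves no server or network.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical in-process stand-in for a worker's function table.
// This is NOT the Gearman API; it only mirrors the addFunction() idea.
class ToyWorker
{
public:
  using Function= std::function<std::string(const std::string&)>;

  // Register a callback under the name that clients will ask for.
  void add_function(const std::string& name, Function fn)
  {
    assert(static_cast<bool>(fn)); // an empty callback could never run a job
    functions_[name]= std::move(fn);
  }

  // Look up the named function and run it on the workload.
  std::string run(const std::string& name, const std::string& workload) const
  {
    auto iter= functions_.find(name);
    if (iter == functions_.end())
    {
      throw std::out_of_range("no such function: " + name);
    }
    return iter->second(workload);
  }

private:
  std::map<std::string, Function> functions_;
};

// Same behaviour as the PHP reverse_fn(): return the workload reversed.
std::string reverse_fn(const std::string& workload)
{
  return std::string(workload.rbegin(), workload.rend());
}
```

Registering reverse_fn under "reverse" and calling run("reverse", "Hello!") gives back the reversed string, which is what the client's do() call receives in the real system.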
Function
gearman_return_t fetch_url(gearman_job_st *job, void*)
{
  // gearman_job_workload() returns const void *, so C++ needs a cast.
  const char *workload= static_cast<const char *>(gearman_job_workload(job));
  size_t workload_size= gearman_job_workload_size(job);

  // Report progress: 0 of 100.
  gearman_job_send_status(job, 0, 100);
  …
  // Stream a partial result back to the client.
  gearman_job_send_data(job, chunk, sizeofchunk);
  …
  gearman_job_send_status(job, 50, 100);
  …
  if (issue_warning)
  {
    gearman_job_send_warning(job, "I'm sorry, Dave. I'm afraid I can't do that.", size);
  }

  return GEARMAN_SUCCESS;
}
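The pattern in fetch_url() is: send a chunk, then report how far along the job is. The sketch below shows the arithmetic only, with no libgearman calls. Status and chunk_with_status are hypothetical names, and fixed-size chunks are an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One progress report, shaped like the (numerator, denominator) pair
// passed to gearman_job_send_status().
struct Status
{
  std::size_t numerator;
  std::size_t denominator;
};

// Hypothetical helper: cut `payload` into chunks of at most `chunk_size`
// bytes and record the status a worker could report after each chunk.
std::vector<std::string> chunk_with_status(const std::string& payload,
                                           std::size_t chunk_size,
                                           std::vector<Status>& statuses)
{
  assert(chunk_size > 0); // a zero-byte chunk size would never finish
  std::vector<std::string> chunks;
  for (std::size_t offset= 0; offset < payload.size(); offset+= chunk_size)
  {
    chunks.push_back(payload.substr(offset, chunk_size));
    const std::size_t sent= offset + chunks.back().size();
    statuses.push_back(Status{sent, payload.size()});
  }
  return chunks;
}
```

In a real worker each chunk would go out through gearman_job_send_data() and each Status through gearman_job_send_status().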
Worker Return
GEARMAN_SUCCESS
GEARMAN_FATAL
GEARMAN_ERROR
GEARMAN_SHUTDOWN
Your Client: request()
Client API (C, PHP, Perl, Python, Java, Drizzle, ...)
Server
Worker API (C, PHP, Perl, Python, Java, ...)
Your Worker: function()
Network, Highly Available, Fault Tolerant
Provided by Gearman: Client API (C, PHP, Perl, Python, Java, Drizzle, ...), Server, Worker API (C, PHP, Perl, Python, Java, ...)
Your Client: request(), e.g. resize_image("…");
Your Worker: function(), e.g. resize_image() { …; return resized; }
Network, Highly Available, Fault Tolerant
CPU? event()
multiple languages
Connectors?
•C/C++
•PHP
•Python
•Java
•MySQL and Postgres
•....
Other items of interest?
•Work status requests.
•Chunked Data.
•Exception Handling.
•Message sizes up to 4 GB.
•Threaded server.
•Coalescence (the stealth killer feature)
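The slide does not define coalescence. The sketch below assumes it means that requests carrying the same unique identifier, arriving while a job is still pending, are merged into one job whose result goes to every waiting client. CoalescingQueue and its methods are hypothetical names; this is a toy model, not gearmand's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Hypothetical toy queue illustrating coalescence on a unique key.
class CoalescingQueue
{
public:
  // Queue a request. A request whose unique key matches a pending job
  // attaches to that job instead of creating a new one.
  // Returns true when a new job was created.
  bool submit(const std::string& unique_key, const std::string& workload)
  {
    auto iter= pending_.find(unique_key);
    if (iter != pending_.end())
    {
      iter->second.waiters++;
      return false;
    }
    pending_[unique_key]= Job{workload, 1};
    return true;
  }

  // Run every pending job exactly once. Returns how many results were
  // delivered, one per waiting client.
  std::size_t run_all(const std::function<std::string(const std::string&)>& fn)
  {
    assert(static_cast<bool>(fn)); // an empty callback cannot run jobs
    std::size_t delivered= 0;
    for (const auto& entry : pending_)
    {
      fn(entry.second.workload);
      executions_++;
      delivered+= entry.second.waiters;
    }
    pending_.clear();
    return delivered;
  }

  // Number of times the worker function actually ran.
  std::size_t executions() const { return executions_; }

private:
  struct Job
  {
    std::string workload;
    std::size_t waiters;
  };
  std::map<std::string, Job> pending_;
  std::size_t executions_= 0;
};
```

Three requests for two distinct keys run the function twice but deliver three results, which is where the savings come from when many clients ask for the same expensive work.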
Namespaces
Foo::resize_image() { … return $image; }
Acme::resize_image() { … return $image; }
Better Map Reduce?
map(list[…], reduce());
map() {…}
reduce() {…}
reduce() {…}
reduce() {…}
Map @#$@# ?
Partitioning
{A...K}  find()
{L...Q}  find()
{R...Z}  find()
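One way to route a key to the three ranges above is to look at its first letter. This is a sketch under that assumption; the slide does not say what the ranges are keyed on, and partition_for is a hypothetical name.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Map a key to one of the three ranges:
// 0 for {A...K}, 1 for {L...Q}, 2 for {R...Z}.
// Assumption: keys are partitioned on their first letter, ignoring case.
std::size_t partition_for(const std::string& key)
{
  assert(!key.empty());
  const int first= std::toupper(static_cast<unsigned char>(key[0]));
  assert(first >= 'A' && first <= 'Z'); // only alphabetic keys are covered
  if (first <= 'K') return 0;
  if (first <= 'Q') return 1;
  return 2;
}
```

Each partition index would then correspond to one find() job, so the three searches can run on different workers at the same time.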
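Partitioning
#include <libgearman/gearman.h>

gearman_return_t split_worker(gearman_job_st *job, void* /* context */)
{
  // gearman_job_workload() returns const void *, so C++ needs a cast.
  const char *workload= static_cast<const char *>(gearman_job_workload(job));
  size_t workload_size= gearman_job_workload_size(job);

  const char *chunk_begin= workload;
  for (size_t x= 0; x < workload_size; x++)
  {
    // Each NUL or space ends a chunk; send it back as one piece of data.
    if (workload[x] == 0 or workload[x] == ' ')
    {
      gearman_job_send_data(job, chunk_begin, workload +x -chunk_begin);
      chunk_begin= workload +x +1;
    }
  }

  // Send whatever follows the last delimiter, if anything.
  if (chunk_begin < workload +workload_size)
  {
    gearman_job_send_data(job, chunk_begin, workload +workload_size -chunk_begin);
  }

  return GEARMAN_SUCCESS;
}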
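The splitting logic can be tried without a Gearman server by collecting the chunks in a vector instead of sending them. split_chunks is a hypothetical name; the sketch splits on NUL or space as split_worker does, and emits the text after the final delimiter as a last chunk.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split `workload` on NUL or space and return the chunks in order.
// Two delimiters in a row produce an empty chunk.
std::vector<std::string> split_chunks(const char* workload, std::size_t workload_size)
{
  assert(workload != nullptr || workload_size == 0);
  std::vector<std::string> chunks;
  const char* chunk_begin= workload;
  for (std::size_t x= 0; x < workload_size; x++)
  {
    if (workload[x] == 0 or workload[x] == ' ')
    {
      chunks.emplace_back(chunk_begin, static_cast<std::size_t>(workload +x -chunk_begin));
      chunk_begin= workload +x +1;
    }
  }
  // Emit whatever follows the last delimiter, if anything.
  if (chunk_begin < workload +workload_size)
  {
    chunks.emplace_back(chunk_begin,
                        static_cast<std::size_t>(workload +workload_size -chunk_begin));
  }
  return chunks;
}
```

In the worker, each element of the returned vector corresponds to one gearman_job_send_data() call.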
Aggregation
$result + $result + $result = sum result
Aggregation
#include <string>
#include <libgearman/gearman.h>

gearman_return_t cat_aggregator(gearman_aggregator_st *, gearman_task_st *task, gearman_result_st *result)
{
  std::string string_value;

  // Walk the task list and concatenate each task's result in order.
  do
  {
    gearman_result_st *result_ptr= gearman_task_result(task);
    string_value.append(gearman_result_value(result_ptr), gearman_result_size(result_ptr));
  } while ((task= gearman_next(task)));

  // Hand the combined value back as the aggregate result.
  gearman_result_store_value(result, string_value.c_str(), string_value.size());

  return GEARMAN_SUCCESS;
}
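The aggregator's loop is a walk over a linked list of tasks, appending each result. The sketch below reproduces that walk with a plain struct so it runs without libgearman. ToyTask and cat_aggregate are hypothetical stand-ins for gearman_task_st and the aggregator callback.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a task that already holds its result.
struct ToyTask
{
  std::string result;
  const ToyTask* next; // nullptr marks the last task
};

// Concatenate the results of every task in the list, in list order.
std::string cat_aggregate(const ToyTask* task)
{
  assert(task != nullptr); // the do/while needs at least one task
  std::string combined;
  do
  {
    combined.append(task->result.data(), task->result.size());
  } while ((task= task->next));
  return combined;
}
```

Swapping the append for an addition would turn the same walk into the "sum result" aggregation from the diagram.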
Do we have to partition?
(What other tricks exist!)
Pipeline
Store() → Resize() → Publish()
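A pipeline feeds the output of each stage into the next. The sketch below shows only that chaining, in one process. Stage, run_pipeline and the three placeholder stages are hypothetical; how the stages would be wired together as Gearman functions is not shown in the slide and is not modelled here.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

using Stage= std::function<std::string(const std::string&)>;

// Run the stages in order, passing each stage's output to the next.
std::string run_pipeline(const std::vector<Stage>& stages, std::string payload)
{
  for (const Stage& stage : stages)
  {
    assert(static_cast<bool>(stage)); // every stage must be callable
    payload= stage(payload);
  }
  return payload;
}

// Placeholder stages: they only tag the payload so the order is visible.
std::string store(const std::string& image) { return "stored(" + image + ")"; }
std::string resize(const std::string& image) { return "resized(" + image + ")"; }
std::string publish(const std::string& image) { return "published(" + image + ")"; }
```

Running the three stages on "cat.jpg" nests the tags in call order, with store innermost and publish outermost.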
Future
0.32 Released
Custom Logging Plugins
Client/Worker Configuration
Extended Administrative Commands
SSL
Status lookup via Unique Identifier
Job Result Cache
Uplift!
•Gearman.org (...)
•http://launchpad.net/gearmand/
•twitter: brianaker
•blog: blog.krow.net
Gearman.info