
Data-Driven Serverless Functions for Object Storage

Josep Sampé, Marc Sánchez-Artigas, Pedro García-López, Gerard París

Universitat Rovira i Virgili, Tarragona, Spain

{josep.sampe|marc.sanchez|pedro.garcia|gerard.paris}@urv.cat

ABSTRACT

Traditionally, active storage techniques have been proposed to move computation tasks to storage nodes in order to exploit data locality. However, we argue in this paper that active storage is ill-suited for cloud storage for two reasons: 1. Lack of elasticity: computing can only scale out with the number of storage nodes; and 2. Resource contention: sharing compute resources can produce interference in the storage system. Serverless computing is now emerging as a promising alternative for ensuring painless scalability, and also for simplifying the development of disaggregated computing tasks.

Here we present an innovative data-driven serverless computing middleware for object storage. It is a lightweight compute solution that allows users to create small, stateless functions that intercept and operate on data flows in a scalable manner, without the need to manage a server or a runtime environment. We demonstrate through different use cases how our solution scales with minimal overhead, while getting rid of the resource contention problems incurred by active storage tasks.

CCS CONCEPTS

• Information systems → Storage architectures; • Applied computing → Enterprise computing infrastructures;

KEYWORDS

serverless functions, object storage, programmability, data flow interception, openstack swift, cloud computing, data management

ACM Reference format:

Josep Sampé, Marc Sánchez-Artigas, Pedro García-López, Gerard París. 2017. Data-Driven Serverless Functions for Object Storage. In Proceedings of Middleware '17, Las Vegas, NV, USA, December 11–15, 2017, 13 pages. DOI: 10.1145/3135974.3135980

1 INTRODUCTION

In the last years, many efforts have been made to collocate computation close to the data, and in this way, exploit data locality. In particular, active storage [26, 27, 29, 31, 36] has pursued two major benefits: 1. to leverage the processing power of storage nodes; and 2. to considerably reduce network traffic at the same time.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Middleware '17, Las Vegas, NV, USA
© 2017 ACM. 978-1-4503-4720-4/17/12...$15.00
DOI: 10.1145/3135974.3135980

Despite its numerous benefits, active storage, and even the well-proven practice of collocating compute and storage, has been shown to be unsuitable for all cloud storage infrastructures. This is especially visible in large-scale storage systems built over commodity hard disk drives (HDDs), like object storage, due to two main problems:

• Elasticity. By collocating compute and storage, computing resources can only scale out with the number of storage nodes. It is thus impossible to provision compute power independently of the storage, and hence fulfill the promise of elasticity. Ideally, compute should scale separately, so that a surge of compute power can be available when needed by tenants.

• Resource contention. One of the main concerns of tenants is response time, in particular for inline real-time services. However, running multiple application functions at the same time in the storage nodes can lead to resource contention problems. And what is worse, it can severely affect the performance of other tenants who share the same storage infrastructure. This in turn complicates the correct billing of resource usage.

While the second problem has been addressed to a certain level [8, 42] (by limiting the resources used by compute tasks), the former cannot be solved with storage resources alone. For these reasons, we argue that active storage is nowadays ill-suited for elastic cloud environments.

To overcome these limitations, the new promise of serverless computing can represent an alternative yet powerful solution to these problems. A key insight is that a serverless execution model is ideal to rapidly scale up and down without the need to process data in the storage servers. Commercial systems like AWS Lambda [2], IBM OpenWhisk [19], Google Cloud Functions [13], and Microsoft Azure Functions [21] offer simplified programming models (Function-as-a-Service, abbreviated FaaS) that can scale out transparently and spare users the management of servers (serverless).

In this line, recent works [10, 20] leverage the simplicity and painless scalability of AWS Lambda to execute massive computations at vast scale over disaggregated object storage resources like Amazon S3. Cloud object stores have become an alternative to store the sheer volume of data at an attractive price. Cloud tenants have begun to use object stores as their main storage substrate instead of compute-collocated cluster storage such as HDFS. For example, Netflix has been exclusively using Amazon S3 as its primary storage solution [22].

In this work, we present a novel data-driven serverless computing middleware for object storage called Zion. We aim to solve the scalability and resource contention problems of active storage, while benefiting from data locality to reduce latency by placing computations close to the data. Our model is data-driven and not



event-driven, because our computations are located in the data pipeline and intercept the data flows that arrive at and return from the object store.

In practice, there are many jobs that require functions to be part of the data pipeline to provide optimal response times. Zion's spotlight is on these jobs: use cases that require synchronous interaction between customers and object storage. Examples of these use cases include (but are not limited to) dynamic content generation, interactive queries, content verification, and access control, which are better suited for a data-driven serverless computing model. Most of these use cases are hard or impossible to implement in current platforms (e.g., AWS Lambda) due to operational requirements; for instance, because they need to transparently intercept incoming requests to the object store, or because they require interactive communications through the standard object store's API.

In contrast, our model is a lightweight solution that allows users to create small, stateless functions that intercept and run on data flows in a scalable manner, without the need to manage a server or a runtime environment. We have coded a prototype implementation of Zion for OpenStack Swift, where the serverless compute layer lies between the proxy and storage nodes. This has allowed us to maximize per-worker write and read performance to storage nodes. We show through several use cases how our solution scales up with minimum overhead and no resource contention.

The contributions of this work are the following:

(1) We confirm the scalability and resource contention issues of collocation (compute close to the data) in object storage. Our key insight is that when a cloud solution cannot afford to include a non-scalable computing layer collocated with the storage nodes, disaggregation is the answer.

(2) We present the new notion of data-driven serverless computing. Unlike previous offline, asynchronous, event-driven approaches, our data-driven model is aimed at inline and synchronous interactions with a data store.

(3) We demonstrate the feasibility of our model with different practical use cases. With free containers available, Zion's overhead is 9 ms, which is well amortized by the functions' execution time, typically in the second range for the average data-driven use cases.

The remainder of this paper is organized as follows. In Section 2, we summarize the background and related work and discuss their advantages and shortcomings. In Section 3, we describe the design of Zion. In Section 4, we show the implementation of Zion on a real system based on OpenStack Swift. In Section 5, we present some applications which our system supports. Finally, in Section 6, we evaluate Zion and the proposed applications.

2 RELATED WORK

Moving computation close to the data to benefit from data locality is nowadays a commonplace idea. In databases, stored procedures and co-processors [4, 12, 40] have standard interfaces and have already been in use in many production environments.

In the context of storage systems, computation close to the data has been studied as active storage. Active storage [31] has led to a plethora of research works proposing platforms for different storage systems, such as file systems [26, 35], object stores [27–29, 32, 36, 42], and software-defined storage architectures [14, 15, 34]. Many of these works proved significant data transfer reductions and optimizations derived from data locality.

But active storage has inherent downsides that preclude its general applicability in the Cloud: 1. Compute tasks are limited by the available resources at storage nodes, which hinders scalability; and 2. It requires resource management to limit access to scarce or sensitive resources at storage nodes. If resources are not properly managed, active storage can incur resource contention problems when computing tasks are heavy.

Some works tried to solve the resource contention problems by limiting the execution of active storage tasks or by delegating the computing tasks to external services [8, 41, 42]. In any case, they cannot solve the scalability problem that makes these systems less attractive for elastic cloud settings.

Even worse, data locality is becoming less and less relevant due to considerable advances in datacenter networking. This makes disaggregation an attractive principle for both storage [6, 18] and in-memory systems [11]. Following this trend, highly scalable serverless platforms such as AWS Lambda are becoming very popular these days to run asynchronous computing tasks over disaggregated object storage like Amazon S3. Recent works are even using AWS Lambda for data-intensive tasks over disaggregated storage [10, 20].

However, serverless platforms like AWS Lambda are mainly designed for asynchronous, event-driven computing tasks over the disaggregated storage resources. For example, when an image is uploaded to S3, the upload event will trigger a Lambda function that will then asynchronously write different resized versions of the image back to S3.

Amazon offers the AWS API Gateway service as a mechanism to synchronously call Lambda functions, which might read data from S3 and output transformed content to the client. This mechanism can be useful for a number of applications, but it also incurs extra overhead, as functions are not in the read/write path from/to S3. Further, applications have to work with another API instead of with the simple S3 API. In this sense, AWS Lambda does not support interception of object storage requests and the corresponding inline processing, as Zion does.

Another related technology is Amazon Lambda@Edge [2]. With Lambda@Edge, it is possible to intercept requests in CloudFront and even modify the request and response. However, Lambda@Edge has stringent computing limitations, concretely 3 seconds of execution time, which significantly reduces the range of potential applications. In general, it has been designed for header and metadata manipulation and for very lightweight computations and data transformations.

Even if not directly related work, Amazon Athena [1] and Redshift Spectrum [3], or Facebook Presto [9], can also provide interactive queries over large-scale object repositories like S3. In particular, Presto offers interactive SQL queries where all processing is in memory and pipelined across the network between stages. Like in our model, the pipelined execution avoids unnecessary I/O and



the associated latency overhead. Unlike our model, these systems cannot provide generic function computations, since they are focused on interactive queries.

[Figure 1: High-level architecture overview of the proposed model. This figure shows the basic components of the proposed architecture to run interception functions in an object storage system: storage gateways, storage nodes with hard disks, a metadata service, and a computation layer of middlebox compute nodes connected by the storage network.]

As we will explain in the rest of the paper, our model enables interception and transformation of data flows directed to the object store. We place a scalable compute layer in the data path between proxy and storage nodes to reduce latency for interactive communications.

3 DESIGN OVERVIEW

Zion has been designed for scalable, data-driven execution of small functions in object stores, and thus all components revolve around the object storage service. Zion's design therefore assumes that the underlying object store follows the "classical" architecture: load balancers distributing the workload evenly across the gateways or proxies, and a large pool of storage nodes which are in charge of storing the objects on hard disks.

As a first overview, Fig. 1 shows a diagram of Zion's architecture. To not interfere with either plain PUT/GET requests or storage management tasks such as replication and failover, Zion lays out a disaggregated computing layer between the storage and gateway nodes for executing the functions. Zion also integrates a metadata service and interception software running in the storage gateways, which inspects incoming requests and reroutes them to the compute tier if necessary.

3.1 Interception Software and Metadata Service

The first component of the system is the interception layer, which is integrated in the storage gateway (depicted as a router in Fig. 1). The major aim of this software is to manage the deployment of functions, the association of triggers to these functions, and their execution when a request matches a trigger.

A trigger is a combination of a URL with prefix and suffix filters, similar to AWS Lambda for Amazon S3, and an HTTP method (GET, PUT, POST, or DELETE). This interception mechanism is enough for many use cases. By specifying the suffix .txt as a filter, for instance, Zion can run a compression function on all GET requests for text objects. The list of triggers is the following (a sketch of the matching logic appears after this list):

• onPut, onPost, and onDelete, which cause the execution of the associated function whenever a PUT, POST, or DELETE request is received, respectively. As an example, the onPut trigger can be useful to process an object before its write to the object store, and even discard its storage, as processing is part of the write path and not asynchronous as in AWS Lambda.

• onBeforeGet: a function associated with this trigger is run when a user performs a GET request to the storage service. This is one of the two cases associated with GET requests. In this case, the function is executed before forwarding the request to the storage node, and hence the function cannot process the targeted data object. However, this trigger can be useful in many use cases, like HTTP header processing, URL rewriting, temporary redirects, etc.

• onAfterGet, which causes any function associated with this trigger to run on an incoming GET request to the storage service. In this case, the function intercepts the storage node's response, and therefore it can dynamically manipulate the object's content.
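As an illustration only, the check a trigger performs can be thought of as a simple predicate over the HTTP method and the object path. The class below is hypothetical (the paper does not specify Zion's internal data structures at this level of detail); it merely captures the prefix/suffix semantics described above:

    // Hypothetical sketch of trigger matching; names and fields are
    // illustrative, not Zion's actual API.
    public class Trigger {
        final String method;       // "GET", "PUT", "POST" or "DELETE"
        final String event;        // e.g. "onPut", "onAfterGet"
        final String prefix;       // e.g. "images/" (null means "any")
        final String suffix;       // e.g. ".txt"    (null means "any")
        final String functionName; // function to run on a match

        Trigger(String method, String event, String prefix,
                String suffix, String functionName) {
            this.method = method; this.event = event; this.prefix = prefix;
            this.suffix = suffix; this.functionName = functionName;
        }

        boolean matches(String httpMethod, String objectPath) {
            return method.equals(httpMethod)
                    && (prefix == null || objectPath.startsWith(prefix))
                    && (suffix == null || objectPath.endsWith(suffix));
        }
    }

For instance, new Trigger("GET", "onAfterGet", null, ".txt", "decompressor") would match a GET of docs/report.txt and route the response through the associated function.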

The metadata service is the second architectural component of Zion. Metadata for triggers is pre-processed and indexed efficiently in the metadata service to guarantee a small, O(1) request matching overhead on the order of microseconds. If there is any match, this layer is also responsible for redirecting the input flow, as the object is read, to an available worker. Non-intercepted data flows rapidly follow the default storage path and bypass the serverless compute layer as usual.

It is worth noting here that the fact that a serverless execution model can quickly spawn new workers is what makes it possible to intercept and process data flows "on-the-fly" without collocation. With VMs, interception would be significantly more difficult to achieve, as VMs can take minutes to start.

3.2 Computation Layer

The third architectural component is the serverless computation layer. The computation layer is a pool of containers which puts the functions into execution.

3.2.1 Functions. A function is the unit of computation code that can process the data. In our model, functions are data-driven, i.e., they are focused on intercepting the data flow and processing the data inline as the object comes in or goes out of the storage cluster. Because of this, in our model the response time of the functions (time to first byte) must be fast so as to not affect the user experience.



In addition to processing the data stream, functions can store information in a persistent way (e.g., an access counter, the timestamp of the last access, the name of the user who is accessing, etc.). Concretely, a function can take one or all of the following actions after intercepting a lifecycle request of an object (a small example follows this list):

• It can update the object metadata (e.g., an access counter).

• It can generate new requests to the object storage service. This includes GETting an object, for example a dependency needed to process the main stream; PUTting an object, for example a subset of the main object as it is processed; deleting an object; and POSTing metadata to another object.

• It can generate new requests to other services (e.g., RabbitMQ, MongoDB, etc.), for example to store some relevant information extracted from the main stream.

• It can update the request/response headers; and

• It can cancel or rewire the request to another object.
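To make the first of these actions concrete, the following hedged sketch combines a metadata update with a pass-through of the data stream, assuming the IFunction interface and Context accessors introduced later in Section 4.2.1 (the accessor names are assumptions):

    // Illustrative sketch: bump an access counter in the object's
    // metadata and echo the data stream unchanged.
    public class AccessCounter implements IFunction {
        public void invoke(Context ctx, API api) {
            String count = ctx.object.metadata.get("access-count");
            int n = (count == null) ? 0 : Integer.parseInt(count);
            ctx.object.metadata.put("access-count", String.valueOf(n + 1));

            byte[] data;
            while ((data = ctx.object.stream.read()) != null) {
                ctx.object.stream.write(data);  // pass the object through
            }
        }
    }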

Functions may make use of third-party library dependencies in order to achieve a specific behavior. Once developed, functions should be packed together with them within a TAR file. Therefore, functions and their dependencies, in conjunction, must be lightweight, so as to minimize the time needed to transfer the function package from the storage system to the compute node.

Once packed, a function is uploaded as a regular object to the object store. An interesting feature of Zion is that it allows the user to set up the CPU and memory requirements and a timeout value for every function. The timeout is the amount of time the system waits to receive the first byte of the function's output. If the function times out, the request is automatically canceled. This information allows functions to be configured differently in order to better manage certain jobs. This information is not mandatory, and Zion has the last word, assigning default values when necessary.

Further, Zion's functions accept parameters. Parameters can be explicit or implicit. Explicit parameters are provided as headers in the request. Implicit parameters are default parameters that a user specifies ahead of time, at the moment of associating a trigger with a function. Explicit parameters take precedence over implicit ones in the case of collision. As an example, consider a function to resize an image, with the image resolution as a parameter. If no argument was passed in the request, the implicit image resolution would be taken from the function's metadata, or an error would be thrown accordingly. The same function can have different parameter sets for different triggers.
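The precedence rule admits a compact sketch. Below, the "X-Zion-Param-" header prefix is a hypothetical convention of our own, used only to illustrate how explicit parameters would override implicit ones:

    import java.util.HashMap;
    import java.util.Map;

    public class Params {
        // Start from the implicit (trigger-time) defaults and let explicit
        // request headers override them on collision.
        static Map<String, String> resolve(Map<String, String> implicitParams,
                                           Map<String, String> requestHeaders) {
            Map<String, String> params = new HashMap<>(implicitParams);
            requestHeaders.forEach((name, value) -> {
                if (name.startsWith("X-Zion-Param-")) {
                    params.put(name.substring("X-Zion-Param-".length()), value);
                }
            });
            return params;
        }
    }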

A final important remark is that two different functions cannot intercept the same data flow in Zion, unless they do it in a pipeline fashion, one after another, which raises no consistency issues.

3.2.2 Compute Node. In a compute node, functions are run inside isolated environments, or containers.

Containers. Each function has its own Linux container in order to not interfere with the other cloud functions. A container with a function running inside is called a worker. A function may have zero, one, or more workers running at the same time, depending on the workload. In the traditional function model, starting a new worker takes around 6–7 seconds [17]. One requirement of our model is that functions have to start running as soon as possible; for this reason, we leverage a ready-to-use pool of containers. In any case, our experiments verify that starting a new container takes around 0.9 seconds, which is practical enough for many synchronous and near-real-time applications. After a certain period of time, idle workers are stopped and the corresponding containers are recycled in order to optimize resource consumption.

Zion Service. This service manages the requests forwarded from the interception layer (Section 3.1). When a new request arrives, it takes a container from the pool, installs the libraries and the function code, and sends the execution command to the runtime. As functions may have specific CPU and memory requirements, this service is also in charge of establishing the resource limits of containers according to their configuration parameters. It also load-balances the workload across already started workers, and starts new workers if necessary.

Runtime. The runtime is integrated into the containers. It accepts functions for a specific programming language and puts them into execution. Our prototype of Zion currently supports Java, but other languages such as Python would be easy to integrate.

4 IMPLEMENTATION

We have implemented a prototype of our serverless framework [33] on top of OpenStack Swift [24]. Swift is a highly scalable object store that can store a large amount of data through a RESTful HTTP API similar to that of Amazon S3. The access path to an object consists of exactly three elements: /account/container/object.

The object is the exact data input by the user, while accounts and containers provide a way of grouping objects. Nesting of accounts and containers is not supported.

Swift is split into several components, which include account, container, and object servers, the latter responsible for handling storage and management of objects. All of these servers are typically deployed in the storage nodes themselves. In addition, proxy servers expose the Swift API to users and stream objects to and from the client upon request. In Swift, data replication and data placement are provided by rings. Each ring is a storage policy that contains the storage nodes and the HDDs where to store the objects.

We have decided to make Zion as unintrusive as possible. As a result, the only modification to the base Swift architecture is a Swift middleware which intercepts the requests at the proxy side. The other elements are decoupled from the main Swift architecture, which makes Zion easier to deploy.

4.1 Interception Software and Metadata Service

In Swift, the simplest way to intercept requests is to create a Swift middleware. We built a new Swift interception middleware for Zion to accomplish two primary tasks: 1. The management of function code and library deployment, including the triggers that cause the functions to be run; and 2. Redirection of requests and responses through the computation layer when they need to be processed by any function.

Upon the assignment of a function, a trigger header of the type onPut, onBeforeGet, onAfterGet, onPost, or onDelete must be provided to tell the framework which lifecycle events to intercept. Zion uses Redis [30], a fast in-memory key-value store, as the distributed metadata service to maintain this information. To optimize request matching, Redis is collocated with the proxy nodes.
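A minimal sketch of the proxy-side lookup, assuming a hypothetical key layout (the paper does not specify Zion's Redis schema) and the Jedis client:

    import java.util.Map;
    import redis.clients.jedis.Jedis;

    public class TriggerLookup {
        // One Redis hash per (trigger, target) pair: field = function name,
        // value = serialized configuration (CPU, memory, timeout, implicit
        // parameters). The "zion:<trigger>:<target>" layout is an assumption.
        static Map<String, String> functionsFor(Jedis redis,
                                                String trigger,  // e.g. "onPut"
                                                String target) { // e.g. "account/container/images/"
            return redis.hgetAll("zion:" + trigger + ":" + target);
        }
    }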



Recall that, as part of the metadata, Zion also includes configuration information for the functions, such as the CPU and memory requirements and cancellation timeouts, as we already discussed in the preceding section.

Function assignment granularity. Although OpenStack Swift stores the objects in a flat namespace, it is possible to simulate a hierarchical structure within a single container by adding forward slash characters (/) to the object name. Each folder between the container and object names is called a pseudo-folder in Swift. For example, in the object name images/zion.jpg, the prefix images/ is the pseudo-folder.

In Zion, we take advantage of this logical hierarchy to enable function assignment at per-object, pseudo-folder, and container levels. Moreover, we also enable mappings at suffix level, for example to run functions on all objects whose name ends with .jpg.

Function execution. After the function assignment process, when a request arrives for an object, Zion's middleware, with the help of the metadata service, checks if that request triggers the execution of a function. For example, if the request is a PUT, the middleware will launch an onPut trigger for the request. If there are functions that respond to this trigger, the middleware will immediately forward the request to an available container. Otherwise, the request will follow the default read/write path.

Parallel processing with functions. In all object stores, there is a limit on the maximum allowed object size. In Swift, this limit is 5 GB. To bypass this limitation, Swift uses a special object type called a Static Large Object [25] (SLO). SLOs are objects split into different parts. The user must upload these parts together with a special object called the manifest, which contains the location of the object parts. Getting an object is totally transparent to users, who make a request to the manifest, and the whole object is returned as if it was stored as a single object in Swift.

This Swift SLO approach enables Zion to associate a different function with the manifest and with the parts. With this mechanism, it is possible to create a highly parallel and distributed computational substrate by executing a function on each dataset part and, finally, running a reduction function on the filtered parts. In Section 5, we present a Hadoop-like use case for object storage implemented with our functions. This is aligned with the recent trend of large-scale data analytics with serverless computing [10, 20].

4.2 Computation Layer

In our implementation, the computation layer is composed of a pool of compute nodes. They are located between the proxies and the storage nodes, as shown in Fig. 1. We note at this moment that the compute nodes assigned to Zion are not shared with OpenStack Nova [23]; they are exclusively managed by Zion.

A general overview of how Zion operates is the following. Each function is run inside a separate Docker container, in what is called a worker. At the time of this writing, Zion's runtime is Java-based, and consequently every function is run in a Java Virtual Machine (JVM). At a very high level, a worker can be viewed as a container running a specific function. Every new invocation of the function is handled by a new thread inside the JVM.

4.2.1 Functions. As Zion's runtime is based on Java, Zion's functions are also written in Java. This means that functions are plain Java classes. They have an entry point called invoke, which contains the main code of the function and which is called upon every new request. The invoke method has two arguments, of type Context and API, which are made available automatically by Zion's runtime on every new request.

The Context encapsulates access to the request headers, the object's metadata, and the object's data stream. It also includes a logger facility for logging the execution of the function. The API enables access to external services. By default, the isolation level of Docker containers precludes functions from having any access to external resources. Through the API, a function can access some external services, such as RabbitMQ and Redis, in addition to Swift itself. The API class is extensible and has been framed to facilitate external interaction with future services.

Here is the implementation code of a simple function that iterates over the data with no further processing:

Listing 1: A function that echoes the data passed to it.

    public class Handler implements IFunction {
        public void invoke(Context ctx, API api) {
            byte[] data;
            // Read the intercepted object stream and echo it back unchanged.
            while ((data = ctx.object.stream.read()) != null) {
                ctx.object.stream.write(data);
            }
        }
    }

4.2.2 Compute Node. The complete architecture of compute nodes is depicted in Fig. 2 and consists of:

Containers. We use Docker containers to sandbox functions, so they have neither direct network access, nor system execution and thread creation capabilities, nor access to the local file system. Only a minimum of ephemeral hard disk space is provided, under the /tmp directory, to be used by functions. We used the Ubuntu 16.04 Docker image [38] for our implementation.

Zion Service. The Zion service is a WSGI server whose mission is to spawn workers for processing the requests forwarded by the interception middleware. Consequently, it must be up and running in all compute nodes because, as described in Section 3.2.2, it is responsible for managing containers and executing functions. When a function is running, this service waits until receiving a response from it. The waiting time is limited by the default system timeout or by a user-defined timeout specified at deployment time.

Runtime. For the runtime, we created a Docker image based on Ubuntu 16.04 with the Java runtime installed on it, to rapidly spawn new containers and execute functions.

As stated in the previous section, each different function is run in a separate container. If the compute node does not have any worker available for that function, the Zion service takes a Docker container from the pool of containers and loads the function code and libraries. Subsequent requests will then be able to reuse the same worker when processing the same "hot" function.

One feature of our functions is that it is possible to modify the object's metadata (key/value) via the Context. By default, Swift stores 3 copies of each object. However, Swift is an eventually consistent system. This means that, at a given time, the replicas of an object may have different data and metadata.



[Figure 2: Zion Compute Node. High-level architecture overview of the components of a Zion compute node: the Zion service, a pool of Docker containers waiting for new function workers, and per-function workers, each running a function inside a JVM on the Java runtime, between the proxy and the storage node.]

As functions are stateless, i.e., there is no relation between the different function invocations even if they occur in the same worker, modifying the local copy of an object's metadata implies waiting for Swift to update all the replicas. This behavior is clearly undesirable in terms of function parallelism.

To mitigate this issue, each function worker is directly attached to a distributed metadata caching system based on Redis. It is an internal feature, totally transparent to the users who develop functions. When a function is executed, it initially loads the object's metadata into the cache, and then performs all the metadata modifications over it. If there are other requests to the same object, the other function invocations will use the previously loaded metadata. When the function terminates, it offloads the metadata from the cache to all the object's replicas.

With this distributed cache, we ensure that the objects' metadata touched by the functions is always consistent in all replicas, even if there are multiple requests at the same time. With it, functions can even do atomic operations over the metadata fields.
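As a hedged sketch of what such an atomic operation could look like with the Jedis client (the key naming is our assumption; the paper does not detail the cache's schema):

    import redis.clients.jedis.Jedis;

    public class MetadataCache {
        // Atomically bump a counter in the Redis metadata cache shared by
        // all workers; the write-back to the object's replicas happens when
        // the function terminates.
        static long bumpAccessCount(Jedis redis, String objectPath) {
            return redis.hincrBy("zion:meta:" + objectPath, "access-count", 1);
        }
    }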

5 APPLICATIONS

Here we show some of the applications that Zion can support, and the ease with which these applications can be built on top of Swift. As expected, the described use cases are not arbitrary; they have been chosen to show the potential of our data-driven model, and in particular for synchronous interactions with the object store. All of them have been implemented over Swift and are evaluated in the next section.

Content-level access control. With Zion, it is extremely easy to implement sophisticated forms of access control based upon the contents of the objects themselves. This example is ideal to show the limitations of AWS Lambda. First, it is clear that access control requires the interception of GET requests, which can only be done indirectly with the help of the API Gateway service. Second, the processing of the object's content to satisfy the access control policy must be done inline, as the user needs a timely notification about the status of her request, which again cannot be realized with AWS Lambda. Finally, Zion's design enables access control to leverage data locality and improve response time, as functions go through the contents of the object as it is read from the storage node.

Typically, access control in object storage operates at the granularity of containers, and hence, once an object is accessible to some party, he gets the full content of the object. Swift also follows this "all or nothing" approach, where access to objects inside a container is enforced through access control lists (ACLs), which may be insufficient when objects contain sensitive content.

To show another capability of our framework, we demonstrate how content-level access control can be realized very easily in Swift thanks to our function abstraction, in a similar fashion to, but cleaner than, [7]. By "content level" we mean that Swift users will be able to access certain parts of an object based on their credentials. To give a concrete example, consider the publicly available Adult dataset from the UCI Machine Learning Repository [39], which contains about 48,000 rows of census information. Each row contains attributes like race, sex, and marital-status which, combined with explicit identifiers such as the SSN¹ that identify the record holders, may leak sensitive information about a given individual. As a result, the records of this object should be accessed differently depending upon the user role. For instance, while a "police agent" should be able to access all fields (SSN, age, education, marital-status, race, sex, relationship, capital-gain, and native-country), a "census analyst" could be restricted to get only a smaller view: age, education, capital-gain, native-country.

To implement this example, we have linked a function to the onAfterGet object trigger to enforce content-level access control on the object adult_data.csv. We have defined a simple access policy that, depending on the user role ("police agent" or "census analyst"), allows getting all the fields or just a smaller projection view. This simple access policy has been stored as an implicit parameter of the function, i.e., in a JSON-formatted string uploaded when we linked the object with the function, as explained in 3.2.1. When a GET request comes for the object adult_data.csv, the proxy first checks the Swift ACL. If the object is accessible by that user, the function then reads the content-level policy and filters the data only if the user has the appropriate role.
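The core of such a function reduces to a streaming projection of CSV columns. The sketch below assumes the IFunction interface of Section 4.2.1 plus hypothetical line-oriented stream, parameter, and role accessors:

    import java.util.Arrays;
    import java.util.List;

    public class ContentLevelFilter implements IFunction {
        public void invoke(Context ctx, API api) {
            // Allowed columns for this role come from the implicit policy,
            // e.g. "age,education,capital-gain,native-country" for a census analyst.
            List<String> allowed =
                    Arrays.asList(ctx.parameters.get(ctx.request.role).split(","));
            String header = ctx.object.stream.readLine();
            List<String> columns = Arrays.asList(header.split(","));

            // Project every line (header included) onto the allowed columns.
            for (String line = header; line != null; line = ctx.object.stream.readLine()) {
                String[] fields = line.split(",");
                StringBuilder out = new StringBuilder();
                for (int i = 0; i < columns.size(); i++) {
                    if (allowed.contains(columns.get(i))) {
                        if (out.length() > 0) out.append(',');
                        out.append(fields[i]);
                    }
                }
                ctx.object.stream.writeLine(out.toString());
            }
        }
    }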

Compression. A typical data reduction task is the compression of objects. In general, any dynamic content filtering that involves inline transformations of the content is ideally suited for our data-driven model. A data-driven task could compress/decompress a file dynamically, "on-the-fly". As Zion acts directly on the data pipeline, i.e., as the object is read from or written to the object store, the result will be either latency improvement or space reduction, depending upon whether Zion intercepts a GET request or a PUT request.

Here we will merely consider the case of compressing incoming objects "on-the-fly", which means that, upon a GET request by the user, the target object will have to undergo decompression. The common approach for doing so in Swift is to implement an ad-hoc compression middleware. However, this approach is problematic. First, by running compression on the storage nodes, compression is repeated as many times as there are replicas. By offloading it to the proxies, we can disrupt Swift's normal operation under heavy load, since the proxies are responsible for looking up the location of objects in the rings and routing the requests accordingly.

¹ As the Adult dataset does not contain explicit identifiers, we added a random SSN to each row using the Fake Name Generator.

With Zion, we can easily write a compression function and execute it between the proxies and the storage nodes in a scalable way, without worrying about resources or repeating the task many times. In addition to the possibility of intercepting GET requests, Zion has another advantage over AWS Lambda: it is capable of running compression over objects whose total compression time exceeds the five-minute limit. This is because Zion cancels a request only if the time to the receipt of the first byte from the function exceeds a timeout value (see Section 3.2.1 for details). This model is ideal for operations such as compression that can be run as data is read/written, which have been the focus of active storage for a long time.

To implement the compression function, we utilized gzip, and then we mapped it to a container with the onPut trigger. As such, all writes to this container will be compressed. We did the reverse process for the onAfterGet trigger, so that when a user requested an object of this container, she would get the original uncompressed version. To do so, we made use of an implicit parameter to tell the function what to do: either compress or decompress. That is, for the onPut trigger, the implicit parameter was set to "compression". For the onAfterGet trigger, the implicit parameter value was set to "decompression".
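A minimal sketch of this function using java.util.zip, again under the assumption that the Context exposes standard InputStream/OutputStream views of the intercepted object (the asInputStream/asOutputStream and "mode" names are ours):

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.util.zip.GZIPInputStream;
    import java.util.zip.GZIPOutputStream;

    public class Gzip implements IFunction {
        public void invoke(Context ctx, API api) {
            try {
                InputStream in = ctx.object.stream.asInputStream();
                OutputStream out = ctx.object.stream.asOutputStream();
                if ("compression".equals(ctx.parameters.get("mode"))) {
                    GZIPOutputStream gz = new GZIPOutputStream(out); // onPut path
                    copy(in, gz);
                    gz.finish();
                } else {
                    copy(new GZIPInputStream(in), out); // onAfterGet path
                }
            } catch (IOException e) {
                ctx.logger.error("gzip function failed: " + e.getMessage());
            }
        }

        private static void copy(InputStream in, OutputStream out) throws IOException {
            byte[] buf = new byte[8192];
            for (int n; (n = in.read(buf)) != -1; ) {
                out.write(buf, 0, n);
            }
        }
    }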

Image processing. One of the archetypal use cases of serverless computing is image resizing, so we found it interesting to evaluate it here. It is ideal to show the potential of asynchronous, event-based functions such as AWS Lambdas, and it is also very useful for tenants that use the object store as a back-end for web image storage. In this case, when the images are uploaded, a function is triggered that resizes them, creating all the needed images, for example for the different possible devices that can request the main web page (phone, tablet, PC, etc.).

We did the same with Zion and coded a function that resizes an image to an arbitrary percentage. To intercept the PUT requests, we linked it to an onPut trigger and specified .jpg as the suffix of the object name. So all objects whose name ended with .jpg were rescaled. As Zion allows functions to create new objects as part of their output, the function stores the original object and its resized version(s).
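For reference, the resizing step itself maps onto the standard javax.imageio and java.awt APIs; as before, the Context accessors and the parameter name are assumptions:

    import java.awt.Graphics2D;
    import java.awt.Image;
    import java.awt.image.BufferedImage;
    import java.io.IOException;
    import javax.imageio.ImageIO;

    public class Resizer implements IFunction {
        public void invoke(Context ctx, API api) {
            try {
                // Scale factor comes from the implicit parameter, e.g. "50".
                int pct = Integer.parseInt(ctx.parameters.get("percentage"));
                BufferedImage src = ImageIO.read(ctx.object.stream.asInputStream());
                int w = src.getWidth() * pct / 100;
                int h = src.getHeight() * pct / 100;
                BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
                Graphics2D g = dst.createGraphics();
                g.drawImage(src.getScaledInstance(w, h, Image.SCALE_SMOOTH), 0, 0, null);
                g.dispose();
                ImageIO.write(dst, "jpg", ctx.object.stream.asOutputStream());
            } catch (IOException e) {
                ctx.logger.error("resize failed: " + e.getMessage());
            }
        }
    }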

Because of interception, one interesting feature of Zion is that it does not require fully storing an image before the generation of its resized version(s), as occurs with AWS Lambda and S3. Since it is done "on-the-fly", prior to storage, Zion saves storage bandwidth. Although of little concern at first glance, this property is very interesting for concurrently processing a vast collection of images, as in [10], because the I/O savings add up rapidly for a large number of objects.

Signature verification. To demonstrate the versatility of Zion, we propose the signature verification use case. Online content checking is again a perfect use case for data-driven functions, since it requires an immediate response to the client.

The main objective of this function is to verify that the objects (documents) that are uploaded are signed by the user, and that the signature is valid, i.e., that the documents are authentic. With Zion, it is possible to do this task in near-real time and notify the user instantly in the case of rejection. Also, in the case of rejection, we prevent the archival of an inauthentic document, thereby saving storage space. For the same reasons as above, it is readily evident that this use case cannot be implemented with AWS Lambdas.

The scenario is as follows. The user obtains the public key from her RSA key pair and uploads it to a public container in the object storage system. Then, the user signs the document with the private key and uploads it, with the signature in a special header, to the object storage system, which puts the function into execution and verifies the document. Note that in this case the function uses an explicit parameter (the signature), as described above.

To do so, we implemented a signature verification function. The function is mapped with a container to the onPut trigger. Therefore, all object PUTs to this container will be checked. The function first gets the public key from the object store, based on the user who is uploading the object. Then it loads the document content and verifies the signature. If it is valid, the document is stored; otherwise, the document is rejected, sending an error message to the user.
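The verification step maps directly onto the standard java.security API. In the sketch below, the key retrieval helper, the header name, and the cancellation accessor are assumptions:

    import java.security.GeneralSecurityException;
    import java.security.PublicKey;
    import java.security.Signature;
    import java.util.Base64;

    public class SignatureVerifier implements IFunction {
        public void invoke(Context ctx, API api) {
            try {
                // Public key of the uploading user, previously stored in Swift.
                PublicKey pub = api.swift.getPublicKey(ctx.request.user);
                byte[] sig = Base64.getDecoder()
                                   .decode(ctx.request.headers.get("X-Zion-Signature"));
                Signature verifier = Signature.getInstance("SHA256withRSA");
                verifier.initVerify(pub);
                byte[] data;
                while ((data = ctx.object.stream.read()) != null) {
                    verifier.update(data);
                    ctx.object.stream.write(data); // forward while verifying
                }
                if (!verifier.verify(sig)) {
                    // Assumed semantics: cancel before the write is committed.
                    ctx.cancelRequest("invalid signature");
                }
            } catch (GeneralSecurityException e) {
                ctx.cancelRequest("verification error: " + e.getMessage());
            }
        }
    }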

Interactive queries and result aggregation. Finally, interactive queries are a use case that perfectly matches our data-driven model. When we want to perform fast data queries over existing data repositories, our data-driven model avoids moving the entire dataset to a computing cluster.

For example, object storage services are commonly used to archive data like log files. Businesses that want to extract fast insights from these data repositories using big data analytics tools must choose between two strategies: 1. Moving data to the computation cluster to take advantage of data locality; or 2. Using a connector to allow data analytics tools to read data directly from the remote object storage. With Zion, we offer a third strategy: compute in the storage cluster using functions that filter data and aggregate results inline.

The case example we present is a top-k query on the access logs of UbuntuOne [16], a personal cloud service. We want to obtain a list of the most active users and the total number of requests each user sent. The implementation of this query in Zion has two functions. The first one receives a fragment of the log file and filters request logs, maintaining a counter for each user, thus exploiting parallelism. The other function receives as input the partial counters of the various instances of the first function, and it performs the aggregation and sorting in order to produce the desired result. To implement this map/reduce-like example, we tapped into Swift's SLO-based data partitioning (see Section 4.1 for further details).
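The first-stage filter is essentially a per-user counter over one SLO chunk; a compact sketch (the log layout and line-oriented accessors are assumptions):

    import java.util.HashMap;
    import java.util.Map;

    public class UserRequestCounter implements IFunction {
        public void invoke(Context ctx, API api) {
            Map<String, Integer> counters = new HashMap<>();
            String line;
            while ((line = ctx.object.stream.readLine()) != null) {
                String user = line.split(" ")[0]; // assume the user ID is the first field
                counters.merge(user, 1, Integer::sum);
            }
            // Emit "user count" partial counters for the aggregation function.
            for (Map.Entry<String, Integer> e : counters.entrySet()) {
                ctx.object.stream.writeLine(e.getKey() + " " + e.getValue());
            }
        }
    }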

6 EVALUATION

To confirm our suspicions, we first studied how Swift behaves when some computation tasks are collocated in the same storage node. We then ran micro-benchmarks to measure the behavior and the overheads of our functions. We did so by running standalone experiments and the applications discussed in Section 5.

6.1 System Setup

Our experimental testbed consisted of a host (or client) with 2 VCPUs and 4 GB RAM. On the server side, we deployed Zion on an 8-machine rack OpenStack Swift (Ocata version) installation, formed by 2 proxy nodes (Dell PowerEdge R320 with 12 GB RAM) and 6 storage nodes (Dell PowerEdge R320 with 8 GB RAM, each one with 4 CPU cores).



[Figure 3: Swift measurement. Measured response times (a) and CPU consumption of a given storage node (b) for getting objects at a certain rate (transactions per second, or TPS) for different numbers of Swift workers (1–4).]

[Figure 4: Swift interference measurement. These plots show how, for a given storage node, the Swift object storage service behaves (response time vs. TPS) for different numbers of object server workers (1–4) and for different CPU loads of interference (Intf, 0–80%).]

On the compute side, the computation layer is composed of 3 nodes (Dell PowerEdge R430 with 32 GB RAM, each one with 24 CPU cores). All the rack machines ran Ubuntu Server 14.04. The client host ran the Ubuntu 16.04.1 CloudImage [37].

6.2 Swift Resource Contention

We first studied how collocation of compute and data affects Swift. To do so in "ideal" conditions, we restricted this experiment to a single storage node². For this measurement, we used a single proxy node (Dell PowerEdge R320 with 12 GB RAM) and 1 storage node (Dell PowerEdge R320 with 8 GB RAM).

Base Swift. The first experiment consisted of evaluating how Swift normally works (Fig. 3). For this experiment, we stored a bunch of random 10 kB files into Swift. Next, using the Unix httperf tool, we ran distinct workloads, each one differing in the number of transactions per second (TPS). We measured the resultant per-transaction response time (Fig. 3a) and CPU usage (Fig. 3b) of the storage node.

Technically, Swift uses workers to handle a workload. In Swift, a worker is nothing but a thread that accepts requests. Each worker normally accepts 1,024 concurrent requests, but this is a configurable parameter. This simple experiment confirmed that when a worker exhausts 100% of its core's resources, the response time increases steeply due to queuing delays. For instance, with 1 worker, the core's usage reaches 100% at around 380 TPS, the point beyond which requests start to accumulate, as shown in Fig. 3a. This effect can be alleviated by starting new Swift workers, as seen in this figure. However, since storage nodes have only 4 cores, Swift is limited to 4 workers. Therefore, a storage node will eventually end up experiencing resource contention sooner or later, even if there are no computations collocated with the data.

² We note that the results are easy to extrapolate to larger Swift deployments.

Collocated computation. We repeated the same experiment, but collocating computations with the storage service. The results in Fig. 4 show how Swift's processing capacity diminishes as a function of the CPU usage borrowed by the collocated tasks. For example, if Swift was restricted to use only 1 worker for request processing, collocated tasks would have almost no impact on Swift. However, as the number of Swift workers increases to match the number of CPU cores, resource contention begins to produce negative effects in the storage system, due to the interference from the collocated computations.



Swift's best practices recommend using as many workers as CPU cores. Under heavy use of the CPU by the collocated tasks, this figure shows that Swift request processing would be severely diminished. For instance, when the collocated tasks consumed 80% of the total CPU, Swift processing fell to 35% in this experiment. This result suggests that the right path to go is to move computations outside the storage nodes, which is what we did with Zion.

6.3 Application Setup

Now we describe the specific evaluation setup for our example use cases. As object stores can contain a huge diversity of object types and sizes [16], we set up heterogeneous workloads for all use cases in order to verify the behavior of Zion under different scenarios.

Content-level access control. For testing this application, we used the dataset described in Section 5, but trimmed down to different object sizes: 100 kB, 1 MB, 10 MB, and 100 MB. As stated before, the dataset content is filtered out according to the type of user that requested the object. To this aim, we used Swift user roles to return only specific fields to each user. The function associated with the onAfterGet trigger reads from an implicit parameter the JSON string containing the allowed fields for a specific user's role (e.g., age, education, marital-status), and then returns them to the user.

Compression. It is well known that the compression ratio of objects affects resource consumption. Objects that contain only zeros will be compressed more quickly and consume fewer resources than a binary object. To factor out this issue, we chose to use text documents with a similar compression ratio of around 60%. The documents were of sizes 10 kB, 100 kB, 1 MB, and 10 MB, respectively.

Image processing. As in the previous case, we tried to find the most common scenario for testing this function. We focused on those image sizes commonly used in static web pages. Finally, we used different .jpg files of 90 kB, 400 kB, 800 kB, and 1200 kB, and we set an implicit parameter so that the function resizes the image to 50% of its original size.

Signature verification. A sample usage of this function may be to verify official documents and forms in a public institution. For this experiment, we used text documents, also of different sizes: 10 kB, 100 kB, 1 MB, and 10 MB. These documents are signed with an RSA private key. The experiment operates on PUT requests, verifying the correctness of the signature.

Interactive data queries and result aggregation. For this use case, we used different sizes of the publicly available UbuntuOne log file [16]: 100 MB, 1 GB, and 10 GB, respectively. We compared Zion's function execution times to those obtained using Hadoop. We built a 10-node Hadoop cluster of commodity workstations: 4-core i5 at 2.53 GHz and 16 GB of RAM. For this application, we issued a Pig query against: 1. the log files stored in HDFS; and 2. Swift, using the Apache Hadoop-Swift connector [5]. For Zion's first-stage filtering functions, we picked chunk sizes of 10 MB, 32 MB, and 200 MB for the 100 MB, 1 GB, and 10 GB files, respectively.

Table 1: Function information

Application                     LOC     Function Size
Content-level Access Control    ≈ 29    2.7 kB
Compression                     ≈ 8     1.8 kB
Image processing                ≈ 17    2.3 kB
Signature verification          ≈ 43    2.9 kB
Interactive data query          ≈ 203   6 kB

Figure 5: Zion runtime startup time. Time in seconds that a Docker container with the Zion Java runtime takes to start (x-axis: startup time in seconds, 0.8 to 1; y-axis: number of Dockers, 0 to 150).

6.4 Application characteristics
Table 1 shows information about our five Zion applications. The LOC column gives the number of lines of code required to execute the function. The third column gives the function size used for each application. From this table, it can be seen that our functions are very lightweight for the proposed applications.

6.5 Results

Docker characteristics. The Java runtime, and then the functions, are run inside Docker containers in our prototype. Consequently, the first validation to do was to assess the impact of starting our runtime within Docker containers. If the startup time was too large, it would hamper the inline processing capabilities of Zion. Our experiments, however, revealed that this is not the case.

For this experiment we utilized a pre-compiled Docker image with our Java runtime integrated in it. The experiment consisted of launching 1,000 containers at different rates in the compute cluster, and measuring the startup time along with the amount of memory needed to keep the runtime up and running over time.

Fig. 5 shows the results of this experiment. As depicted, starting a new Docker container with the Java runtime takes between 0.85 and 0.95 seconds. Regarding RAM consumption, we found that each container consumes around 35MB of memory. These results show that our runtime takes 5X less time to start than AWS Lambda's runtime, which proves that Zion is lightweight enough for elastic inline processing. As Zion utilizes a large pool of already started containers, the startup time is typically negligible, and only amounts to 0.9 seconds if there are no free containers in the pool.


Figure 6: Zion base overhead. These boxplots show measured times (0.01 to 0.04 seconds) of (a) requesting 10kB objects in Swift, and (b) getting the same objects with a no-operation (nop) function through Zion.

Zion overhead. As another basic experiment, we studied the extra overhead that Zion adds when running a function. The overhead is the time needed by the interception middleware to redirect the object to a compute node, take an available container, launch a function inside the Docker container, and pass the object data through it. For this experiment we made use of the function listed in Listing 1.

This simple function iterates over the data without processing it. For measuring the overhead we used a set of 10kB objects. We first launched 5,000 plain GET requests to measure the base time needed to complete a GET request. Then we launched another 5,000 GET requests for the same objects, but in this case, we set up the function in Listing 1 to respond upon the onAfterGet trigger of the objects.

The results are plotted in Fig. 6. This figure shows that Zion's overhead is 9 ms. This time includes a penalty of 5 ms for the addition of an internal hop to Swift (redirecting the data through a compute node), plus 4 ms to determine whether, and which, function to run inside a free Docker container in the pool when the request touches a compute node.

Performance of collocated functions. Here we evaluated the performance of our application functions. For this evaluation, we conducted a stress test for each combination of application, object size, and number of function workers. As our storage nodes have 4 CPU cores, we ran each function on 1, 2, 3 and 4 function workers, respectively. The Docker containers were set up as follows: 1 CPU core and 512MB of RAM per worker. The results are depicted in Figs. 7-10.

The first experiments were done collocating the containers in the storage nodes, without using the Zion computation layer. The main objectives of this experiment were: 1. To show how collocation limits the number of TPS; and 2. To show the performance of Zion's functions using 1 CPU core per function worker.

Figure 7: Compression function performance. Response time (ms) vs. TPS for 10kB, 100kB, 1MB and 10MB objects, with 1 to 4 workers.

Figure 8: Content-level access control function performance. Response time (ms) vs. TPS for 100kB, 1MB, 10MB and 100MB objects, with 1 to 4 workers.

As a side effect, notice that the latter makes it possible to faithfully generalize our results to any number of cores. We empirically verified this for the image resize use case. As the cluster is made up of 6 storage nodes, the function with 1 worker was able to process around 9 images per second for the 400kB image case, and around 54 images per second in the entire cluster. For the rest of the applications we utilized only one storage node. The reason is that, to tap into the performance of the entire cluster, we would have to replicate each object 6 times, to ensure that a copy of it is available at every storage node. In active storage, the total computation power for an object is proportional to the number of replicas it has.

Notice that in Figs. 7-10, the curves representing the response time start to increase when the resources of the Docker container are overloaded, i.e., when all the CPU cores are at 100% of their capacity. These experiments also show how the object size is very important: in all experiments, the larger the object, the lower the number of TPS a function can handle. To wit, for the content-level access control application (Fig. 8), the function is unable to handle more than 1 TPS for the 100MB object, irrespective of the number of workers. Concretely, this task takes around 1.6 seconds to complete. The same occurs with the image resizing function.


Figure 9: Image resizer function performance. Response time (ms) vs. TPS for 90kB, 400kB, 800kB and 1200kB images, with 1 to 4 workers.

Figure 10: Signature verification function performance. Response time (ms) vs. TPS for 10kB, 100kB, 1MB and 10MB objects, with 1 to 4 workers.

We saw that resizing a 1200kB image takes around 2.36 seconds, leading to less than 1 TPS.

A function that needs more than 1 second to process an object can only handle in parallel as many objects as function workers are running. These examples show that function collocation at the storage nodes would in any case be insufficient to absorb a burst of transactions lasting more than one second. They also demonstrate that it is almost impossible to predict resource consumption ahead of time, because resource usage depends on the object size. Consequently, resource management for collocated functions should be dynamic, which is hard to achieve in practice, instead of a simple resource management policy such as one single worker per CPU core.

Performance of non-collocated functions. As resource contention is not an issue with non-collocated functions, what is key here is to verify that Zion's storage-disaggregated functions are indeed scalable.

In order to test how Zion's compute layer provides better scalability than the storage nodes, we extended the prior evaluation to non-collocated functions. Our compute nodes have 24 cores each, which means it is possible to start up to 24 function workers in each node.

Figure 11: Image resizer function scalability. Max TPS vs. number of workers (1 to 8) for 90kB, 400kB, 800kB and 1200kB images; storage nodes vs. Zion compute layer.

Figure 12: Signature verification function scalability. Max TPS vs. number of workers (1 to 8) for 10kB, 100kB, 1MB and 10MB objects; storage nodes vs. Zion compute layer.

However, to make the comparison fair with the storage nodes, we utilized between 1 and 8 function workers.

As in the previous experiments, we used 1 proxy and 1 storage node, in addition to 1 compute node, which is enough to assess the scalability of Zion. The results are shown in Figs. 11-14. In this case, we recorded the maximum number of transactions per second (Max TPS) that each worker was able to handle, with and without collocation.

First, we can see how, in almost all cases with 1 to 4 function workers, non-collocated functions can handle more transactions than the storage nodes. This is due to the fact that the compute nodes are more powerful than the storage nodes in terms of CPU; therefore, their ingestion capacity is slightly higher. Second, the maximum TPS that a storage node can handle is always bounded above by the number of available cores. That is, spawning more workers has no benefit, because they will contend for the available resources. This is the reason why the curve for collocated functions (gray line) flattens out beyond 4 workers in all use cases.

However, non-collocated functions (black line) can continue to process transactions by spawning more workers to respond to the demand. By a quick inspection of all the figures, the scalability of non-collocated functions is practically linear with the number of workers.


Figure 13: Compression function scalability. Max TPS vs. number of workers (1 to 8) for 10kB, 100kB, 1MB and 10MB objects; storage nodes vs. Zion compute layer.

Figure 14: Content-level access control function scalability. Max TPS vs. number of workers (1 to 8) for 100kB, 1MB, 10MB and 100MB objects; storage nodes vs. Zion compute layer.

This suggests that disaggregating storage and compute is a practical solution to scale out computation in cloud object stores.

Interactive data queries. Table 2 compares the execution times of the same query (Listing 2) for Hadoop and Zion. The entries of this table were obtained by running each configuration 30 times and then averaging the results.

Unsurprisingly, the Hadoop configuration that ingests the data from the remote Swift service is the one that presents the highest mean execution time. This occurred due to network contention: both clusters were geographically far apart, so communication went through some Fast Ethernet links within our institution's LAN. More concretely, it took ≈ 917 seconds to transfer the entire 10GB dataset, and ≈ 86 seconds to transfer the 1GB dataset.

The configuration that read the data from HDFS had better times, especially for the large datasets. However, it has the disadvantage that it requires pre-loading all the data into HDFS, which may not be practical in many cases.

For Zion functions, the execution times were comparatively small, despite using at most 24 workers. This suggests that better times can be easily attained with more workers. This is clearly seen when comparing Zion with 12 and 24 workers, respectively.

Table 2: Interactive data queries execution times

Configuration        File size   Chunk size   Time
Pig query - Swift    100 MB      10 MB        81.6 s
Pig query - HDFS     100 MB      10 MB        71.4 s
Zion - 12 Workers    100 MB      10 MB        0.510 s
Zion - 24 Workers    100 MB      10 MB        0.348 s
Pig query - Swift    1 GB        32 MB        156 s
Pig query - HDFS     1 GB        32 MB        75.4 s
Zion - 12 Workers    1 GB        32 MB        4.183 s
Zion - 24 Workers    1 GB        32 MB        2.256 s
Pig query - Swift    10 GB       200 MB       985 s
Pig query - HDFS     10 GB       200 MB       94.6 s
Zion - 12 Workers    10 GB       200 MB       26.253 s
Zion - 24 Workers    10 GB       200 MB       13.392 s

For the 10GB dataset, 12 workers took 2X more time than 24 workers. This experiment also confirms the scalability of Zion without incurring resource contention in the storage cluster. If we had executed this interactive query with 24 workers in the storage nodes, the storage cluster CPU would have reached 100% in all storage nodes for 13.4 seconds, leading to resource contention.

Listing 2: Evaluated query in Hadoop and Swift + Zion clusters

select user_id, count(*) total
where (req_t = 'GetContentResponse' or req_t = 'PutContentResponse')
and msg = 'Request done'
group by user_id order by total DESC limit 10

7 CONCLUSION
This paper presents Zion, an innovative data-driven serverless computing middleware for object storage. Unlike commercial event-driven serverless models, our data-driven functions intercept and operate over objects as they are read from, or written to, the object store. Since Zion is serverless, it overcomes the scalability and resource contention problems of active storage without the need to manage a server or a runtime environment.

By injecting computations into the data pipeline, Zion is ideal for use cases that require synchronous interactions with external users. Examples of these use cases include (but are not limited to) dynamic content generation, interactive queries, personalization, content verification, and access control. In many of them, the data locality of our inline computations contributes to optimizing latency and reducing data transfers outside the data center.

ACKNOWLEDGMENTS
This work has been partially funded by the European Union Horizon 2020 Framework Programme in the context of the project IOStack: Software-defined Storage for Big Data (H2020-644182), and by the Spanish Ministry of Science and Innovation (Grant TIN2016-77836-C2-1-R).


REFERENCES
[1] Amazon. 2017. Athena. https://aws.amazon.com/athena. (2017).
[2] Amazon. 2017. Lambda. https://aws.amazon.com/lambda. (2017).
[3] Amazon. 2017. Redshift Spectrum. https://aws.amazon.com/redshift/spectrum. (2017).
[4] Apache. 2012. HBASE Coprocessors. https://blogs.apache.org/hbase/entry/coprocessor_introduction. (2012).
[5] Apache. 2017. Hadoop OpenStack Support: Swift Object Store. https://hadoop.apache.org/docs/stable/hadoop-openstack/index.html. (2017).
[6] Krste Asanovic and D. Patterson. 2014. Firebox: A hardware building block for 2020 warehouse-scale computers. In FAST.
[7] Prosunjit Biswas, Farhan Patwa, and Ravi Sandhu. 2015. Content Level Access Control for OpenStack Swift Storage. In ACM CODASPY. 123-126.
[8] Chao Chen, Yong Chen, and Philip C. Roth. 2012. Dosas: Mitigating the resource contention in active storage systems. In CLUSTER. 164-172.
[9] Facebook. 2017. Presto: Distributed SQL Engine for Big Data. https://prestodb.io. (2017).
[10] Sadjad Fouladi, Riad S. Wahby, Brennan Shacklett, Karthikeyan Vasuki Balasubramaniam, William Zeng, Rahul Bhalerao, Anirudh Sivaraman, George Porter, and Keith Winstein. 2017. Encoding, Fast and Slow: Low-Latency Video Processing Using Thousands of Tiny Threads. In NSDI.
[11] Peter X. Gao, Akshay Narayan, Sagar Karandikar, Joao Carreira, Sangjin Han, Rachit Agarwal, Sylvia Ratnasamy, and Scott Shenker. 2016. Network requirements for resource disaggregation. In OSDI. 249-264.
[12] Google. 2009. BigTable Coprocessors. https://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf. (2009).
[13] Google. 2017. Cloud Functions. https://cloud.google.com/functions/docs. (2017).
[14] Raúl Gracia-Tinedo, Pedro García-López, Marc Sánchez-Artigas, Josep Sampé, Yosef Moatti, Eran Rom, Dalit Naor, Ramon Nou, Toni Cortés, Pietro Michiardi, and William Oppermann. 2016. IOStack: Software-defined Object Storage. IEEE Internet Computing (2016).
[15] Raúl Gracia-Tinedo, Josep Sampé, Edgar Zamora, Marc Sánchez-Artigas, Pedro García-López, Yosef Moatti, and Eran Rom. 2017. Crystal: Software-Defined Storage for Multi-Tenant Object Stores. In FAST.
[16] Raúl Gracia-Tinedo, Yongchao Tian, Josep Sampé, Hamza Harkous, John Lenton, Pedro García-López, Marc Sánchez-Artigas, and Marko Vukolic. 2015. Dissecting UbuntuOne: Autopsy of a Global-scale Personal Cloud Back-end. In IMC. 155-168.
[17] Scott Hendrickson, Stephen Sturdevant, Tyler Harter, Venkateshwaran Venkataramani, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2016. Serverless computation with OpenLambda. In HotCloud.
[18] HP. 2017. The Machine: Our vision for the Future of Computing. https://www.labs.hpe.com/the-machine. (2017).
[19] IBM. 2017. OpenWhisk. https://developer.ibm.com/openwhisk. (2017).
[20] Eric Jonas, Shivaram Venkataraman, Ion Stoica, and Benjamin Recht. 2017. Occupy the Cloud: Distributed computing for the 99%. arXiv preprint arXiv:1702.04024 (2017).
[21] Microsoft. 2017. Azure Functions. https://azure.microsoft.com/en-us/services/functions. (2017).
[22] Netflix. 2016. Evolution of the Netflix Data Pipeline. http://techblog.netflix.com/2016/02/evolution-of-netflix-data-pipeline.html. (2016).
[23] OpenStack. 2017. Nova. https://docs.openstack.org/nova. (2017).
[24] OpenStack. 2017. Swift. https://docs.openstack.org/swift. (2017).
[25] OpenStack. 2017. Swift Large Objects. https://docs.openstack.org/swift/latest/overview_large_objects.html. (2017).
[26] Juan Piernas, Jarek Nieplocha, and Evan J. Felix. 2007. Evaluation of active storage strategies for the lustre parallel file system. In SC. 28.
[27] Lingjun Qin and Dan Feng. 2006. Active Storage Framework for Object-based Storage Device. In AINA. 97-101.
[28] Simona Rabinovici-Cohen, Ealan Henis, John Marberg, and Kenneth Nagin. 2014. Storlet engine: performing computations in cloud storage. IBM Technical Report H-0320 (August 2014).
[29] Paul Rad, Van Lindberg, Jeff Prevost, Weining Zhang, and Mo Jamshidi. 2014. ZeroVM: secure distributed processing for big data analytics. In WAC. 1-6.
[30] Redis. 2017. Redis. https://redis.io. (2017).
[31] Erik Riedel, Garth A. Gibson, and Christos Faloutsos. 1998. Active Storage for Large-Scale Data Mining and Multimedia. In VLDB. 62-73.
[32] Michael T. Runde, Wesley G. Stevens, Paul A. Wortman, and John A. Chandy. 2012. An active storage framework for object storage devices. In MSST. 1-12.
[33] Josep Sampé. 2017. Zion framework. https://github.com/JosepSampe/zion. (2017).
[34] Josep Sampé, Pedro García-López, and Marc Sánchez-Artigas. 2016. Vertigo: Programmable micro-controllers for software-defined object storage. In IEEE CLOUD. 180-187.
[35] Seung Woo Son, Samuel Lang, Philip Carns, Robert Ross, Rajeev Thakur, Berkin Ozisikyilmaz, Prabhat Kumar, Wei-Keng Liao, and Alok Choudhary. 2010. Enabling active storage on parallel I/O software stacks. In MSST. 1-12.
[36] Zhipeng Tan, Yanli Yuan, Dan Feng, Tian Zhan, and Xudong Tu. 2011. Implementation of method object in active object-based storage systems. In CCIS. 204-211.
[37] Ubuntu. 2017. Cloud image repository. https://cloud-images.ubuntu.com. (2017).
[38] Ubuntu. 2017. Docker repository. https://hub.docker.com/_/ubuntu. (2017).
[39] University of California, Irvine. 1996. Adult Data Set. http://archive.ics.uci.edu/ml/datasets/Adult. (1996).
[40] Himanshu Vashishtha and Eleni Stroulia. 2011. Enhancing query support in hbase via an extended coprocessors framework. In ServiceWave. 75-87.
[41] Rajiv Wickremesinghe, Jeffrey S. Chase, and Jeffrey Scott Vitter. 2002. Distributed computing with load-managed active storage. In HPDC. 13-23.
[42] Yulai Xie, Dan Feng, Yan Li, and Darrell D. E. Long. 2016. Oasis: an active storage framework for object storage platform. Future Generation Computer Systems 56 (2016), 746-758.



event-driven, because our computations are located in the data pipeline and intercept the data flows that arrive at, and return from, the object store.

In practice, there are many jobs that require functions to be part of the data pipeline in order to provide optimal response times. Zion's spotlight is on these jobs: use cases that require synchronous interaction between customers and object storage. Examples of these use cases include (but are not limited to) dynamic content generation, interactive queries, content verification, and access control, which are better suited for a data-driven serverless computing model. Most of these use cases are hard or impossible to implement in current platforms (e.g., AWS Lambda) due to operational requirements, for instance, because they need to transparently intercept incoming requests to the object store, or because they require interactive communications through the standard object store's API.

In contrast, our model is a lightweight solution that allows users to create small, stateless functions that intercept and run on data flows in a scalable manner, without the need to manage a server or a runtime environment. We have coded a prototype implementation of Zion for OpenStack Swift, where the serverless compute layer lies between the proxy and storage nodes. This has allowed us to maximize per-worker write and read performance to storage nodes. We show through several use cases how our solution scales up with minimum overhead and no resource contention.

The contributions of this work are the following:

(1) We confirm the scalability and resource contention issues of collocation (compute close to the data) in object storage. Our key insight is that when a cloud solution cannot afford to include a non-scalable computing layer collocated with the storage nodes, disaggregation is the answer.

(2) We present the new notion of data-driven serverless computing. Unlike previous offline, asynchronous, event-driven approaches, our data-driven model is aimed at inline and synchronous interactions with a data store.

(3) We demonstrate the feasibility of our model with different practical use cases. With free containers available, Zion's overhead is of 9 ms, which is well amortized by the functions' execution time, typically in the second range for the average data-driven use cases.

The remainder of this paper is organized as follows. In Section 2, we summarize the background and related work, and discuss their advantages and shortcomings. In Section 3, we describe the design of Zion. In Section 4, we show the implementation of Zion on a real system based on OpenStack Swift. In Section 5, we present some applications which our system supports. Finally, in Section 6, we evaluate Zion and the proposed applications.

2 RELATED WORK
Moving computation close to the data to benefit from data locality is nowadays a commonplace idea. In databases, stored procedures and co-processors [4, 12, 40] have standard interfaces and have already been in use in many production environments.

In the context of storage systems, computation close to the data has been studied as active storage. Active storage [31] has led to a plethora of research works proposing platforms for different storage systems, such as file systems [26, 35], object stores [27-29, 32, 36, 42], and software-defined storage architectures [14, 15, 34]. Many of these works proved significant data transfer reductions and optimizations derived from data locality.

But active storage inherently has some downsides that preclude its general applicability in the Cloud: 1. Compute tasks are limited by the available resources at the storage nodes, which hinders scalability; and 2. It requires resource management to limit access to scarce or sensitive resources at the storage nodes. If resources are not properly managed, active storage can incur resource contention problems when computing tasks are heavy.

Some works tried to solve the resource contention problems by limiting the execution of active storage tasks, or by delegating the computing tasks to external services [8, 41, 42]. In any case, they cannot solve the scalability problem, which makes these systems less attractive for elastic cloud settings.

Even worse, data locality is becoming less and less relevant due to considerable advances in datacenter networking. This makes disaggregation an attractive principle for both storage [6, 18] and in-memory systems [11]. Following this trend, highly scalable serverless platforms such as AWS Lambda are becoming very popular these days to run asynchronous computing tasks over disaggregated object storage like Amazon S3. Even recent works are using AWS Lambda for data-intensive tasks over disaggregated storage [10, 20].

However, serverless platforms like AWS Lambda are mainly designed for asynchronous, event-driven computing tasks over disaggregated storage resources. For example, when an image is uploaded to S3, the upload event will trigger a Lambda function that will then asynchronously write different resized versions of the same image back to S3.

Amazon offers the AWS API Gateway service as a mechanism to synchronously call Lambda functions, which might read data from S3 and output transformed content to the client. This mechanism can be useful for a number of applications, but it also incurs extra overhead, as functions are not in the read/write path from/to S3. Further, applications have to work with another API instead of the simple S3 API. In this sense, AWS Lambda does not support interception of object storage requests and the corresponding inline processing, as Zion does.

Another related technology is Amazon Lambda@Edge [2]. With Lambda@Edge, it is possible to intercept requests in CloudFront, and even modify the request and response. However, Lambda@Edge has stringent computing limitations, concretely 3 seconds of execution time, which significantly reduces the range of potential applications. In general, it has been designed for header and metadata manipulation, and for very lightweight computations and data transformations.

Even if they are not directly related work, Amazon Athena [1] and Redshift Spectrum [3], or Facebook Presto [9], can also provide interactive queries over large-scale object repositories like S3. In particular, Presto offers SQL interactive queries where all processing is in memory and pipelined across the network between stages.


Figure 1: High-level architecture overview of the proposed model. This figure shows the basic components of the proposed architecture to run interception functions in an object storage system: storage gateways with the interception software, a metadata service, a computation layer of compute-node middleboxes, and storage nodes connected by the storage network.

Like in our model, the pipelined execution avoids unnecessary I/O and the associated latency overhead. Unlike our model, these systems cannot provide generic function computations, since they are focused on interactive queries.

As we will explain in the rest of the paper, our model enables interception and transformation of data flows directed to the object store. We place a scalable compute layer in the data path between proxy and storage nodes to reduce latency for interactive communications.

3 DESIGN OVERVIEW
Zion has been designed for scalable, data-driven execution of small functions in object stores, and thus all components revolve around the object storage service. Zion's design therefore assumes that the underlying object store follows the "classical" architecture of load balancers distributing the workload evenly across the gateways or proxies, and a large pool of storage nodes, which are in charge of storing the objects on the hard disks.

As a first overview, Fig. 1 shows a diagram of Zion's architecture. To not interfere with plain PUT/GET requests or with storage management tasks such as replication and failover, Zion lays out a disaggregated computing layer between the storage and gateway nodes for executing the functions. Zion also integrates a metadata service and interception software running in the storage gateways, which inspects incoming requests and reroutes them to the compute tier if necessary.

3.1 Interception Software and Metadata Service
The first component of the system is the interception layer, which is integrated in the storage gateway (depicted as a router in Fig. 1). The major aim of this software is to manage the deployment of functions, the association of triggers to these functions, and their execution when a request matches a trigger.

A trigger is a combination of a URL with prefix and suffix filters, similar to AWS Lambda for Amazon S3, and an HTTP method (GET, PUT, POST or DELETE). This interception mechanism is enough for many use cases. By specifying the suffix txt as a filter, for instance, Zion can run a compression function on all GET requests for text objects. The list of triggers is the following:

• onPut, onPost and onDelete, which cause the execution of the associated function whenever a PUT, POST or DELETE request is received, respectively. As an example, the onPut trigger can be useful to process an object before its write to the object store, and even discard its storage, as processing is part of the write path and not asynchronous as in AWS Lambda.

• onBeforeGet: a function associated with this trigger is run when a user performs a GET request to the storage service. This is one of the two cases associated with GET requests. In this case, the function is executed before forwarding the request to the storage node, and hence the function cannot process the targeted data object. However, this trigger can be useful in many use cases, like HTTP header processing, URL rewriting, temporary redirects, etc.

• onAfterGet, which causes any function associated with this trigger to run on an incoming GET request to the storage service. In this case, the function intercepts the storage node's response, and therefore it can dynamically manipulate the object's content.

The metadata service is the second architectural component of Zion. Metadata for triggers is pre-processed and indexed efficiently in the metadata service to guarantee a small, O(1) request-matching overhead on the order of microseconds. If there is a match, this layer is also responsible for redirecting the input flow, as the object is read, to an available worker. Non-intercepted data flows rapidly follow the default storage path and bypass the serverless compute layer as usual.
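For illustration purposes only (the paper does not include this code), the matching can be pictured as constant-time lookups on composite keys that combine the trigger type with the request path, probed at the assignment granularities described in Section 4.1. The sketch below is ours; the key format and class names are hypothetical, and an in-memory map stands in for the Redis-backed metadata service:

import java.util.HashMap;
import java.util.Map;

// Minimal sketch of O(1) trigger matching (hypothetical key scheme).
public class TriggerIndex {
    // Key: "<trigger>:<container>/<pseudo-folder or object>" or "<trigger>:*.<suffix>"
    // Value: name of the function package to execute.
    private final Map<String, String> index = new HashMap<>();

    public void register(String trigger, String path, String function) {
        index.put(trigger + ":" + path, function);
    }

    // Probe object-level, pseudo-folder-level, container-level and
    // suffix-level mappings; each probe is an O(1) hash lookup.
    public String match(String trigger, String container, String objectName) {
        String fn = index.get(trigger + ":" + container + "/" + objectName);
        if (fn == null && objectName.contains("/")) {
            String folder = objectName.substring(0, objectName.lastIndexOf('/'));
            fn = index.get(trigger + ":" + container + "/" + folder);
        }
        if (fn == null) fn = index.get(trigger + ":" + container);
        if (fn == null && objectName.contains(".")) {
            String suffix = objectName.substring(objectName.lastIndexOf('.') + 1);
            fn = index.get(trigger + ":*." + suffix);
        }
        return fn; // null means: follow the default storage path
    }
}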

It is worth noting here that the fact that a serverless execution model can quickly spawn new workers is what makes it possible to intercept and process data flows "on-the-fly" without collocation. With VMs, interception would be significantly more difficult to achieve, as VMs can take minutes to start.

3.2 Computation Layer
The third architectural component is the serverless computation layer. The computation layer is a pool of containers which puts the functions into execution.

3.2.1 Functions. A function is the computation code unit which processes the data. In our model, functions are data-driven, i.e., they are focused on intercepting the data flow and processing the data inline, as the object comes in or goes out of the storage cluster. Because of this, in our model the response time of the functions (time to first byte) must be fast, so as not to affect the user experience.


In addition to processing the data stream, functions can store information in a persistent way (e.g., an access counter, the timestamp of the last access, the name of the user who is accessing, etc.). Concretely, a function can take one or all of the following actions after intercepting a lifecycle request of an object (see the sketch after this list):

• It can update the object metadata (e.g., an access counter).

• It can generate new requests to the object storage service. This includes GETting an object, for example a dependency needed to process the main stream; PUTting an object, for example a subset of the main object as it is processed; DELETEing an object; and POSTing metadata to another object.

• It can generate new requests to other services (e.g., RabbitMQ, MongoDB, etc.), for example to store some relevant information extracted from the main stream.

• It can update the request/response headers.

• It can cancel or rewire the request to another object.
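The following sketch (ours, not code from the Zion prototype) illustrates the first and fourth actions, building on the IFunction interface of Listing 1 (Section 4.2.1); the metadata and header accessors are hypothetical names, since the paper does not spell out the full interface:

public class AccessCounter implements IFunction {
    public void invoke(Context ctx, API api) {
        // Persistently update per-object metadata; atomicity is provided
        // by Zion's Redis-backed metadata cache (Section 4.2.2).
        long count = ctx.object.metadata.increment("X-Access-Count", 1); // hypothetical accessor
        // Annotate the response seen by the client.
        ctx.response.setHeader("X-Access-Count", String.valueOf(count)); // hypothetical accessor
        // Pass the object data through unmodified.
        String data;
        while ((data = ctx.object.stream.read()) != null) {
            ctx.object.stream.write(data);
        }
    }
}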

Functions may make use of third-party library dependencies in order to achieve a specific behavior. Once developed, functions should be packed with their dependencies within a TAR file. Therefore, functions and their dependencies, taken together, must be lightweight, in order to minimize the time needed to transfer the function package from the storage system to the compute node.

Once packed, a function is uploaded as a regular object to the object store. An interesting feature of Zion is that it allows the user to set up the CPU and memory requirements, and a timeout value, for every function. The timeout is the amount of time the system waits to receive the first byte of the function's output. If the function times out, the request is automatically canceled. This information allows functions to be configured differently in order to better manage certain jobs. This information is not mandatory, and Zion has the last word, assigning default values when necessary.

Further, Zion's functions accept parameters. Parameters can be explicit or implicit. Explicit parameters are provided as headers in the request. Implicit parameters are default parameters that a user specifies ahead of time, at the moment of associating a trigger with a function. Explicit parameters take precedence over implicit ones in the case of collision. As an example, consider a function to resize an image, with the image resolution as a parameter. If no argument was passed in the request, the implicit image resolution would be taken from the function's metadata, or an error would be thrown accordingly. The same function can have different parameter sets for different triggers.

A final important remark is that two different functions cannot intercept the same data flow in Zion, unless they do it in a pipeline fashion, one after another, which raises no consistency issues.

3.2.2 Compute Node. In a compute node, functions are run inside isolated environments, or containers.

Containers. Each function has its own Linux container, in order not to interfere with the other cloud functions. A container with a function running inside is called a worker. A function may have zero, one, or more workers running at the same time, depending on the workload. In the traditional function model, starting a new worker takes around 6-7 seconds [17]. One requirement of our model is that functions have to start running as soon as possible; for this reason, we leverage a ready-to-use pool of containers. In any case, our experiments verify that starting a new container takes around 0.9 seconds, which is practical enough for many synchronous and near-real-time applications. After a certain period of time, idle workers are stopped and the corresponding containers are recycled in order to better optimize resource consumption.

Zion Service. This service manages the requests forwarded from the interception layer (3.1). When a new request arrives, it takes a container from the pool, installs the libraries and the function code, and sends the execution command to the runtime. As functions may have specific CPU and memory requirements, this service is also in charge of establishing the resource limits of the containers according to their configuration parameters. It also load-balances the workload across already started workers, and starts new workers if necessary.

Runtime. The runtime is integrated into the containers. It accepts functions for a specific programming language and puts them into execution. Our prototype of Zion currently supports Java, but other languages such as Python would be easy to integrate.

4 IMPLEMENTATION
We have implemented a prototype of our serverless framework [33] on top of OpenStack Swift [24]. Swift is a highly scalable object store that can store a large amount of data through a RESTful HTTP API, similar to that of Amazon S3. The access path to an object consists of exactly three elements: /account/container/object.

The object is the exact data input by the user, while accounts and containers provide a way of grouping objects. Nesting of accounts and containers is not supported.

Swift is split into several components, which include account, container and object servers, the latter being responsible for handling the storage and management of objects. All of these servers are typically deployed on the storage nodes themselves. In addition, proxy servers expose the Swift API to users and stream objects to and from the client upon request. In Swift, data replication and data placement are provided by rings. Each ring is a storage policy that contains the storage nodes and the HDDs where the objects are stored.

We have decided to make Zion as little intrusive as possible. As a result, the only modification to the base Swift architecture is a Swift middleware which intercepts the requests at the proxy side. The other elements are decoupled from the main Swift architecture, which makes Zion easier to deploy.

4.1 Interception Software and Metadata Service
In Swift, the simplest way to intercept requests is to create a Swift middleware. We built a new Swift interception middleware for Zion to accomplish two primary tasks: 1. The management of function code and library deployment, including the triggers that cause the functions to be run; and 2. Redirection of requests and responses through the computation layer when they need to be processed by a function.

Upon the assignment of a function, a trigger header of the type onPut, onBeforeGet, onAfterGet, onPost or onDelete must be appointed to tell the framework which lifecycle events to intercept. Zion uses Redis [30], a quick in-memory key-value store, as a distributed metadata service to maintain this information. To optimize request matching, Redis is collocated with the proxy nodes.


Recall that, as part of the metadata, Zion also includes configuration information for the functions, such as the CPU and memory requirements and cancellation timeouts, as we already discussed in the preceding section.

Function assignment granularity. Although OpenStack Swift stores the objects in a flat namespace, it is possible to simulate a hierarchical structure within a single container by adding forward slash characters (/) in the object name. Each folder between the container and object names is called a pseudo-folder in Swift. For example, in the object name images/zion.jpg, the prefix images is the pseudo-folder.

In Zion, we take advantage of this logical hierarchy to enable function assignment at per-object, pseudo-folder, and container levels. Moreover, we also enable mappings at suffix level, for example, to run functions on all objects whose names end with jpg.

Function execution. After the function assignment process, when a request arrives for an object, Zion's middleware, with the help of the metadata service, checks whether that request triggers the execution of a function. For example, if the request is a PUT, the middleware will launch an onPut trigger for the request. If there are functions that respond to this trigger, the middleware will immediately forward the request to an available container. Otherwise, the request will follow the default read/write path.

Parallel processing with functions. In all object stores there is a limit on the maximum allowed object size; in Swift, this limit is 5GB. To bypass this limitation, Swift uses a special object type called Static Large Object [25] (SLO). SLOs are objects split into different parts. The user must upload these parts together with a special object called the manifest, which contains the location of the object parts. Getting an object is totally transparent to users, who make a request to the manifest, and the whole object is returned as if it were stored as a single object in Swift.

This Swift SLO approach enables Zion to associate different functions to the manifest and to the parts. With this mechanism, it is possible to create a highly parallel and distributed computational substrate by executing a function on each dataset part, and finally running a reduction function on the filtered parts. In Section 5, we present a Hadoop-like use case for object storage implemented with our functions. This is aligned with the recent trend of large-scale data analytics with serverless computing [10, 20].

4.2 Computation Layer
In our implementation, the computation layer is composed of a pool of compute nodes. They are located between the proxies and the storage nodes, as shown in Fig. 1. We note at this point that the compute nodes assigned to Zion are not shared with OpenStack Nova [23]; they are exclusively managed by Zion.

A general overview of how Zion operates is the following. Each function is run inside a separate Docker container, in what is called a worker. At the time of this writing, Zion's runtime is Java-based, and consequently, every function is run in a Java Virtual Machine (JVM). At a very high level, a worker can be viewed as a container running a specific function. Every new invocation of the function is handled by a new thread inside the JVM.

4.2.1 Functions. As Zion's runtime is based on Java, Zion's functions are also written in Java. This means that functions are plain Java classes. They have an entry point called invoke, which contains the main code of the function and is called upon every new request. The invoke method has two arguments, of types Context and API, which are made available automatically by Zion's runtime on every new request.

The Context encapsulates access to the request headers, the object's metadata, and the object's data stream. It also includes a logger facility for logging the execution of the function. The API enables access to external services. By default, the isolation level of Docker containers precludes functions from having any access to external resources. Through the API, a function can access some external services, such as RabbitMQ and Redis, in addition to Swift itself. The API class is extensible and has been framed to facilitate external interaction with future services.

Here we have the implementation code of a simple function that iterates over the data with no further processing:

Listing 1: A function that echoes the data passed to it

public class Handler implements IFunction {
    public void invoke(Context ctx, API api) {
        String data;
        while ((data = ctx.object.stream.read()) != null) {
            ctx.object.stream.write(data);
        }
    }
}

4.2.2 Compute Node. The complete architecture of compute nodes is depicted in Fig. 2 and consists of:

Containers. We use Docker containers to sandbox functions, so they have neither direct network access, nor system execution and thread creation capabilities, nor access to the local file system. Only a minimum ephemeral hard disk space is provided, under the /tmp directory, to be used by functions. We used the Ubuntu 16.04 Docker image [38] for our implementation.

Zion Service. The Zion service is a WSGI server whose mission is to spawn workers for processing the requests forwarded by the interception middleware. Consequently, it must be up and running in all compute nodes because, as described in 3.2.2, it is responsible for managing containers and executing functions. When a function is running, this service waits until receiving a response from it. The waiting time is limited by the default system timeout, or by a user-defined timeout specified at deployment time.

Runtime. For the runtime, we created a Docker image based on Ubuntu 16.04 with the Java runtime installed on it, to rapidly spawn new containers and execute functions.

As stated in the previous section, each different function is run in a separate container. If the compute node does not have any worker available for that function, the Zion service takes a Docker container from the pool of containers and loads the function code and libraries. Subsequent requests will then be able to reuse the same worker when processing the same "hot" function.

One feature of our functions is that it is possible to modify the object's metadata (key/value) via the Context. By default, Swift stores 3 copies of each object. However, Swift is an eventually consistent system.


Figure 2: Zion Compute Node. High-level architecture overview of the components of a Zion compute node: the Zion service, a pool of Docker containers with the Java runtime waiting for new function workers, and running workers, each executing a function inside a JVM.

This means that, at a given time, the replicas of an object may have different data and metadata. As functions are stateless, i.e., there is no relation between different function invocations even if they occur in the same worker, modifying the local copy of an object's metadata implies waiting for Swift to update all the replicas. This behavior is clearly undesirable in terms of function parallelism.

To mitigate this issue, each function worker is directly attached to a distributed metadata caching system based on Redis. It is an internal feature, totally transparent to the users who develop functions. When a function is executed, it initially loads the object's metadata into the cache, and then performs all the metadata modifications over it. If there are other requests to the same object, the other function invocations will use the previously loaded metadata. When the function terminates, it offloads the metadata from the cache to all the object's replicas.

With this distributed cache, we ensure that the object metadata touched by the functions is always consistent in all replicas, even if there are multiple requests at the same time. With it, functions can even perform atomic operations over the metadata fields.
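As a rough sketch of this write-back cache (our own reconstruction; the paper does not give the code), the logic could look as follows using the Jedis client, where writeToAllReplicas is a hypothetical helper for the Swift side:

import redis.clients.jedis.Jedis;
import java.util.Map;

// Sketch of the Redis-backed, write-back metadata cache.
public class MetadataCache {
    private final Jedis redis = new Jedis("metadata-cache-host"); // assumed host name

    // Load the object's metadata into Redis on first access.
    public void load(String objectPath, Map<String, String> replicaMetadata) {
        if (!redis.exists(objectPath)) {
            redis.hset(objectPath, replicaMetadata);
        }
    }

    // Atomic update shared by all concurrent function invocations.
    public long increment(String objectPath, String field) {
        return redis.hincrBy(objectPath, field, 1);
    }

    // On function termination, flush the cached metadata to all replicas.
    public void offload(String objectPath) {
        Map<String, String> metadata = redis.hgetAll(objectPath);
        writeToAllReplicas(objectPath, metadata); // hypothetical Swift-side helper
        redis.del(objectPath);
    }

    private void writeToAllReplicas(String path, Map<String, String> md) {
        /* POST metadata to every replica; omitted. */
    }
}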

5 APPLICATIONS
Here we show some of the applications that Zion can support, and the ease with which these applications can be built on top of Swift. As expected, the described use cases are not arbitrary: they have been chosen to show the potential of our data-driven model, and in particular, of synchronous interactions with the object store. All of them have been implemented over Swift and are evaluated in the next section.

Content-level Access Control. With Zion it is extremely easy to implement sophisticated forms of access control based upon the contents of the objects themselves. This example is ideal to show the limitations of AWS Lambda. First, it is clear that access control requires the interception of GET requests, which can only be done indirectly with the help of the API Gateway service. Second, the processing of the object's content to satisfy the access control policy must be done inline, as the user needs a timely notification about the status of her request, which again cannot be realized with AWS Lambda. Finally, Zion's design enables access control to leverage data locality and improve response time, as functions go through the contents of the object as it is read from the storage node.

Typically, access control in object storage operates at the granularity of containers, and hence, once an object is accessible to some party, he gets the full content of the object. Swift also follows this "all or nothing" approach, where access to objects inside a container is enforced through access control lists (ACLs), which may be insufficient when objects contain sensitive content.

In the exercise of showing another capability of our framework, we show how content-level access control can be realized very easily in Swift thanks to our function abstraction, in a similar fashion to, but cleaner than, [7]. By "content level" we mean that Swift users will be able to access certain parts of an object based on their credentials. To give a concrete example, consider the publicly available Adult dataset from the UCI Machine Learning Repository [39], which contains about 48,000 rows of census information. Each row contains attributes like race, sex and marital-status which, combined with explicit identifiers such as the SSN¹ that identify the record holders, may leak sensitive information about a given individual. As a result, the records of this object should be accessed differently depending upon the user role. For instance, while a "police agent" should be able to access all fields (SSN, age, education, marital-status, race, sex, relationship, capital-gain and native-country), a "census analyst" could be restricted to get only a smaller view (age, education, capital-gain, native-country).

To implement this example, we have linked a function to the onAfterGet object trigger to enforce content-level access control on the object adult_data.csv. We have defined a simple access policy that, depending on the user role ("police agent" or "census analyst"), allows getting all the fields or just a smaller projection view. This simple access policy has been stored as an implicit parameter of the function, i.e., as a JSON formatted string uploaded when we linked the object with the function, as explained in 3.2.1. When a GET request comes for the object adult_data.csv, the proxy first checks the Swift ACL. If the object is accessible by that user, the function then reads the content-level policy and filters the data only if the user has the appropriate role.
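As a hedged illustration (our own sketch, not the paper's code), such a function could filter CSV columns by role as follows; the accessors for the role header, the implicit policy parameter, and the line-wise stream operations are hypothetical names modeled on Listing 1:

import java.util.Arrays;
import java.util.List;

public class ContentLevelAccessControl implements IFunction {
    public void invoke(Context ctx, API api) {
        // Role of the requesting user (e.g., set by the auth middleware).
        String role = ctx.request.getHeader("X-User-Role");      // hypothetical accessor
        // Implicit parameter: JSON policy mapping roles to allowed columns.
        List<String> allowed = ctx.parameters.getList(role);     // hypothetical accessor

        String header = ctx.object.stream.readLine();            // hypothetical line-wise read
        List<String> columns = Arrays.asList(header.split(","));

        String line;
        while ((line = ctx.object.stream.readLine()) != null) {
            String[] fields = line.split(",");
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                if (allowed.contains(columns.get(i))) {          // keep allowed columns only
                    if (out.length() > 0) out.append(',');
                    out.append(fields[i]);
                }
            }
            ctx.object.stream.write(out.toString() + "\n");
        }
    }
}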

Compression. A typical data reduction task is the compression of objects. In general, any dynamic content filtering that involves inline transformations of the content is ideally suited for our data-driven model. A data-driven task can compress or decompress a file dynamically, "on-the-fly". As Zion acts directly on the data pipeline, i.e., as the object is read from or written to the object store, the result will be either latency improvement or space reduction, depending upon whether Zion intercepts a GET request or a PUT request.

Here we will merely consider the case of compressing incoming objects "on-the-fly", which means that, upon a GET request by the user, the target object will have to undergo decompression. The common approach for doing so in Swift is to implement an ad-hoc compression middleware. However, this approach is problematic. First, by running compression on the storage nodes, compression is repeated as many times as there are replicas.

¹ As the Adult dataset does not contain explicit identifiers, we added a random SSN to each row using the Fake Name Generator.


Second, by offloading it to the proxies, we can disrupt Swift's normal operation under heavy load, since the proxies are responsible for looking up the location of objects in the rings and routing the requests accordingly.

With Zion, we can easily write a compression function and execute it between the proxies and the storage nodes in a scalable way, without worrying about resources or repeating the task many times. In addition to the possibility of intercepting GET requests, Zion has another advantage over AWS Lambda: it is capable of running compression over objects whose total compression time exceeds the five-minute limit. This is because Zion cancels a request only if the time until the receipt of the first byte from the function exceeds a timeout value (see Section 3.2.1 for details). This model is ideal for operations such as compression that can be run as data is read/written, which have been the focus of active storage for a long time.

To implement the compression function, we utilized gzip, and we mapped the function to a container with the onPut trigger. As such, all writes to this container will be compressed. We did the reverse process for the onAfterGet trigger, so that when a user requested an object from this container, she would get the original uncompressed version. To do so, we made use of an implicit parameter to tell the function what to do: either compress or decompress. That is, for the onPut trigger, the implicit parameter was set to "compression"; for the onAfterGet trigger, the implicit parameter value was set to "decompression".
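A minimal sketch of such a function (ours, under the assumption of byte-wise stream and parameter accessors, which are hypothetical names) could wrap the object data with the JDK's gzip streams:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

public class Compressor implements IFunction {
    public void invoke(Context ctx, API api) {
        try {
            // Implicit parameter set at trigger-assignment time:
            // "compression" for onPut, "decompression" for onAfterGet.
            String mode = ctx.parameters.get("mode"); // hypothetical accessor
            if (mode.equals("compression")) {
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                GZIPOutputStream gzip = new GZIPOutputStream(buffer);
                byte[] chunk;
                while ((chunk = ctx.object.stream.readBytes()) != null) { // hypothetical byte-wise read
                    gzip.write(chunk);
                }
                gzip.finish();
                ctx.object.stream.write(buffer.toByteArray()); // hypothetical byte-wise write
            }
            // Decompression is symmetric, using GZIPInputStream (omitted).
        } catch (IOException e) {
            ctx.logger.error("compression failed: " + e.getMessage()); // logger facility from 4.2.1
        }
    }
}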

Image processing. One of the archetypal use cases of serverless computing is image resizing, so we found it interesting to evaluate it here. It is ideal to show the potential of asynchronous, event-based functions such as AWS Lambdas, and it is also very useful for tenants that use the object store as a back-end for web image storage. In this case, when the images are uploaded, a function is triggered, resizing and creating all the needed images, for example, for the different possible devices that can request the main web page (phone, tablet, PC, etc.).

We did the same with Zion, and coded a function that resizes an image to an arbitrary percentage. To intercept the PUT requests, we linked it to an onPut trigger and specified jpg as the suffix of the object name. So, all objects whose names ended with jpg were rescaled. As Zion allows creating new objects as part of the function's output, the function stores the original object and its resized version(s).
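A sketch of this kind of function using the JDK imaging classes could look as follows (ours, not the prototype's code; the parameter accessor, the stream adapter, and the api.swift.put call are hypothetical names):

import javax.imageio.ImageIO;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;

public class ImageResizer implements IFunction {
    public void invoke(Context ctx, API api) {
        try {
            // Implicit parameter: target size as a percentage (e.g., 50).
            int percent = Integer.parseInt(ctx.parameters.get("resize-percent")); // hypothetical accessor

            BufferedImage original = ImageIO.read(ctx.object.stream.asInputStream()); // hypothetical adapter
            int w = original.getWidth() * percent / 100;
            int h = original.getHeight() * percent / 100;

            BufferedImage resized = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
            Graphics2D g = resized.createGraphics();
            g.drawImage(original, 0, 0, w, h, null); // scale to the target dimensions
            g.dispose();

            // Store the resized copy as a new object next to the original.
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            ImageIO.write(resized, "jpg", out);
            api.swift.put(ctx.object.getName() + "_resized.jpg", out.toByteArray()); // hypothetical API call
        } catch (Exception e) {
            ctx.logger.error("resize failed: " + e.getMessage()); // logger facility from 4.2.1
        }
    }
}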

Because of interception, one interesting feature of Zion is that it does not require fully storing an image before the generation of its resized version(s), as occurs with AWS Lambda and S3. Since resizing is done "on-the-fly", prior to storage, Zion saves storage bandwidth. Although of not much concern at first glance, this property is very interesting for concurrently processing a vast collection of images, as in [10], because I/O savings add up rapidly for a large number of objects.

Signature verification. To demonstrate the versatility of Zion, we proposed the signature verification use case. Online content checking is again a perfect use case for data-driven functions, since it requires an immediate response to the client.

The main objective of this function is to verify that the objects (documents) that are uploaded are signed by the user, and that the signature is valid, i.e., that the documents are authentic. With Zion it is possible to do this task in near-real time and notify the user instantly in case of rejection.

Also, in case of rejection, we prevent the archival of an inauthentic document, thereby saving storage space. For the same reasons as above, it is readily evident that this use case cannot be implemented with AWS Lambdas.

The scenario is as follows. The user obtains the public key from her RSA key pair and uploads it to a public container in the object storage system. Then, the user signs the document with the private key and uploads it, with the signature in a special header, to the object storage system, which puts the function into execution and verifies the document. Note that in this case the function uses an explicit parameter (the signature), as described above.

To do so, we implemented a signature verification function. The function is mapped with a container to the onPut trigger; therefore, all object PUTs to this container will be enforced. The function first gets the public key from the object store, based on the user who is uploading the object. Then, it loads the document content and verifies the signature. If it is valid, the document is stored; otherwise, the document is rejected, sending an error message to the user.
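The core verification can be done with the standard java.security API, as in the following sketch (ours; the header, user, byte-wise stream, api.swift.get and cancelRequest calls are hypothetical names, and the signature algorithm is an assumption):

import java.security.KeyFactory;
import java.security.PublicKey;
import java.security.Signature;
import java.security.spec.X509EncodedKeySpec;
import java.util.Base64;

public class SignatureVerifier implements IFunction {
    public void invoke(Context ctx, API api) {
        try {
            // Explicit parameter: the signature travels in a request header.
            byte[] signature = Base64.getDecoder().decode(ctx.request.getHeader("X-Signature")); // hypothetical accessor

            // Fetch the uploader's public key from the public container.
            byte[] keyBytes = api.swift.get("keys/" + ctx.request.getUser() + ".pub"); // hypothetical API call
            PublicKey publicKey = KeyFactory.getInstance("RSA")
                    .generatePublic(new X509EncodedKeySpec(keyBytes));

            // Stream the document through the verifier as it is uploaded.
            Signature verifier = Signature.getInstance("SHA256withRSA"); // assumed algorithm
            verifier.initVerify(publicKey);
            byte[] chunk;
            while ((chunk = ctx.object.stream.readBytes()) != null) { // hypothetical byte-wise read
                verifier.update(chunk);
                ctx.object.stream.write(chunk); // pass data through toward storage
            }

            if (!verifier.verify(signature)) {
                // Canceling discards the write, since onPut processing
                // is part of the write path (Section 3.1).
                ctx.cancelRequest(401, "Invalid signature"); // hypothetical rejection call
            }
        } catch (Exception e) {
            ctx.cancelRequest(500, e.getMessage()); // hypothetical rejection call
        }
    }
}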

Interactive queries and result aggregation. Finally, interactive queries are a use case that perfectly matches our data-driven model. When we want to perform fast data queries over existing data repositories, our data-driven model avoids moving the entire dataset to a computing cluster.

For example, object storage services are commonly used to archive data like log files. Businesses that want to extract fast insights from these data repositories using big data analytics tools must choose between two strategies: 1. Moving data to the computation cluster to take advantage of data locality; or 2. Using a connector to allow data analytics tools to read data directly from the remote object storage. With Zion, we offer a third strategy: compute in the storage cluster using functions that filter data and aggregate results inline.

The case example we present is a top-k query on the access logs of UbuntuOne [16], a personal cloud service. We want to obtain a list of the most active users and the total number of requests each user sent. The implementation of this query in Zion has two functions. The first one receives a fragment of the log file and filters request logs, maintaining a counter for each user, thus exploiting parallelism. The other function receives as input the partial counters of the various instances of the first function, and performs the aggregation and sorting to produce the desired result. To implement this MapReduce-like example, we tapped into Swift's SLO-based data partitioning (see Section 4.1 for further details). A sketch of the first-stage filter function follows.
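This is a minimal sketch under the same assumptions as the earlier examples: the stream accessors are hypothetical names, and extractUserId stands in for the actual parsing of the UbuntuOne log format [16]:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.util.HashMap;
import java.util.Map;

public class TopKFilter implements IFunction {
    public void invoke(Context ctx, API api) {
        try {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(ctx.getObjectInputStream()));
            Map<String, Long> counts = new HashMap<>();
            String line;
            while ((line = in.readLine()) != null) {
                String user = extractUserId(line);
                if (user != null) counts.merge(user, 1L, Long::sum);
            }
            // Emit the partial counters; the second function aggregates,
            // sorts and truncates them into the final top-k result.
            PrintWriter out = new PrintWriter(ctx.getObjectOutputStream());
            for (Map.Entry<String, Long> e : counts.entrySet()) {
                out.println(e.getKey() + "," + e.getValue());
            }
            out.flush();
        } catch (Exception e) {
            ctx.log("filter failed: " + e.getMessage());
        }
    }

    // Hypothetical parser: the real row format is described in [16].
    private String extractUserId(String row) {
        String[] fields = row.split(",");
        return fields.length > 1 ? fields[1] : null;
    }
}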

6 EVALUATION
To confirm our suspicions, we first studied how Swift behaves when computation tasks are collocated in the same storage node. We then ran micro-benchmarks to measure the behavior and the overheads of our functions. We did so by running standalone experiments and the applications discussed in Section 5.

6.1 System Setup
Our experimental testbed consisted of a host (or client) with 2 VCPUs and 4GB RAM. On the server side, we deployed Zion in an 8-machine rack OpenStack Swift (Ocata version) installation, formed by 2 proxy nodes (Dell PowerEdge R320 with 12GB RAM) and 6 storage nodes (Dell PowerEdge R320 with 8GB RAM, each one with 4 CPU cores).


[Figure 3: Swift measurement. Measured response times (a) and CPU consumption of a given storage node (b) when getting objects at a certain rate (transactions per second, TPS), for 1 to 4 Swift workers.]

[Figure 4: Swift interference measurement. Response time vs. TPS of the Swift object storage service on a given storage node, for 1 to 4 object server workers (panels a-d) and different CPU loads of interference (0-80% Intf).]

At the compute side, the computation layer is composed of 3 nodes (Dell PowerEdge R430 with 32GB RAM, each one with 24 CPU cores). All the rack machines ran Ubuntu Server 14.04. The client host ran an Ubuntu 16.04.1 CloudImage [37].

6.2 Swift Resource Contention
We first studied how the collocation of compute and data affects Swift. To do so in "ideal" conditions, we restricted this experiment to a single storage node². For this measurement, we used a single proxy node (Dell PowerEdge R320 with 12GB RAM) and 1 storage node (Dell PowerEdge R320 with 8GB RAM).

Base Swift. The first experiment consisted of evaluating how Swift normally works (Fig. 3). For this experiment, we stored a bunch of random 10kB files into Swift. Next, using the Unix httperf tool, we ran distinct workloads, each one differing in the number of transactions per second (TPS). We measured the resultant per-transaction response time (Fig. 3a) and CPU usage (Fig. 3b) of the storage node.

Technically, Swift uses workers to handle a workload. In Swift, a worker is nothing but a thread that accepts requests. Each worker normally accepts 1,024 concurrent requests, but this is a configurable parameter. This simple experiment confirmed that when a worker exhausts 100% of its core's resources, the response time increases steeply due to queuing delays. For instance, with 1 worker, the core's usage reaches 100% at around 380 TPS, the point beyond which requests start to accumulate, as shown in Fig. 3a. This effect can be alleviated by starting new Swift workers, as seen in this figure. However, since storage nodes have only 4 cores, Swift is limited to 4 workers. Therefore, a storage node will eventually end up experiencing resource contention sooner or later, even if there are no computations collocated with the data.

² We note that the results are easy to extrapolate to larger Swift deployments.

Collocated Computation. We repeated the same experiment, but collocating computations with the storage service. The results in Fig. 4 show how Swift's processing capacity diminishes as a function of the CPU usage consumed by the collocated tasks. For example, if Swift were restricted to use only 1 worker for request processing, collocated tasks would have almost no impact on Swift. However, as the number of Swift workers increases to match the number of CPU cores, resource contention begins to produce negative effects in the storage system, due to the interference from the collocated computations.


Swift's best practices recommend using as many workers as CPU cores. Under heavy use of the CPU by the collocated tasks, Fig. 4 shows that Swift request processing would be severely diminished. For instance, when the collocated tasks consumed 80% of the total CPU, Swift processing fell to 35% in this experiment. This result suggests that the right path is to move computations outside the storage nodes, which is what we did with Zion.

6.3 Application Setup
Now we describe the specific evaluation setup for our example use cases. As object stores can contain a huge diversity of object types and sizes [16], we set up heterogeneous workloads for all use cases in order to verify the behavior of Zion under different scenarios.

Content-level access control. For testing this application, we used the dataset described in Section 5, but trimmed down to different object sizes: 100kB, 1MB, 10MB and 100MB. As stated before, the dataset content is filtered out according to the type of user that requested the object. To this aim, we used Swift user roles to return only specific fields to each user. The function, associated to the onAfterGet trigger, reads from an implicit parameter the JSON string containing the allowed fields for a specific user's role (e.g., age, education, marital-status), and then returns them to the user, as sketched below.
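This is a minimal sketch of the filter; it assumes the same hypothetical stream and parameter accessors as the earlier sketches, and reads the allowed fields as a flat comma-separated list in place of the JSON policy of the deployed function:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ContentLevelACL implements IFunction {
    public void invoke(Context ctx, API api) {
        try {
            // Implicit parameter set when the onAfterGet trigger was assigned,
            // e.g. "age,education,capital-gain" for the "census analyst" role.
            List<String> allowed =
                    Arrays.asList(ctx.getParameter("allowed-fields").split(","));

            BufferedReader in = new BufferedReader(
                    new InputStreamReader(ctx.getObjectInputStream()));
            PrintWriter out = new PrintWriter(ctx.getObjectOutputStream());

            // Decide from the CSV header which column indexes this role may see.
            String[] header = in.readLine().split(",");
            List<Integer> keep = new ArrayList<>();
            StringBuilder head = new StringBuilder();
            for (int i = 0; i < header.length; i++) {
                if (allowed.contains(header[i].trim())) {
                    keep.add(i);
                    if (head.length() > 0) head.append(',');
                    head.append(header[i].trim());
                }
            }
            out.println(head);

            // Stream the rows back with only the projected columns.
            String line;
            while ((line = in.readLine()) != null) {
                String[] fields = line.split(",");
                StringBuilder row = new StringBuilder();
                for (int i : keep) {
                    if (row.length() > 0) row.append(',');
                    row.append(fields[i]);
                }
                out.println(row);
            }
            out.flush();
        } catch (Exception e) {
            ctx.log("content-level filtering failed: " + e.getMessage());
        }
    }
}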

Compression. It is well known that the compression ratio of objects affects resource consumption: objects that contain only zeros will be compressed more quickly, and consume fewer resources, than a random binary object. To get rid of this issue, we chose to use text documents with a similar compression ratio of around 60%. The documents were of sizes 10kB, 100kB, 1MB and 10MB, respectively.
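The compression function itself (Section 5) reduces to a streaming gzip pass over the object. A minimal sketch of the onPut (compression) direction, again with the hypothetical stream accessors, is:

import java.io.InputStream;
import java.util.zip.GZIPOutputStream;

public class Compressor implements IFunction {
    public void invoke(Context ctx, API api) {
        try {
            InputStream in = ctx.getObjectInputStream();
            GZIPOutputStream out = new GZIPOutputStream(ctx.getObjectOutputStream());
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n); // compress inline as the object flows to storage
            }
            out.finish(); // flush the gzip trailer before the PUT completes
        } catch (Exception e) {
            ctx.log("compression failed: " + e.getMessage());
        }
    }
}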

Image processing. As in the previous case, we tried to find the most common scenario for testing this function. We focused on those image sizes commonly used in static web pages. Finally, we used different .jpg files of 90kB, 400kB, 800kB and 1200kB, and we set an implicit parameter so that the function resizes the image to 50% of its original size.

Signature verification. A sample usage of this function may be to verify official documents and forms in a public institution. For this experiment, we used text documents, also of different sizes: 10kB, 100kB, 1MB and 10MB. These documents are signed with an RSA private key. The experiment operates on PUT requests, verifying the correctness of the signature.

Interactive data queries and result aggregation. For this use case, we used different sizes of the publicly available UbuntuOne log file [16]: 100MB, 1GB and 10GB, respectively. We compared the execution times of Zion's functions to those obtained using Hadoop. We built a 10-node Hadoop cluster of commodity workstations (4-core i5 at 2.53 GHz and 16GB of RAM). For this application, we issued a Pig query against: 1. The log files stored in HDFS; and 2. Swift, using the Apache Hadoop-Swift connector [5]. For Zion's first-stage filtering functions, we picked chunk sizes of 10MB, 32MB and 200MB for the 100MB, 1GB and 10GB files, respectively.

Table 1: Function information

Application                    LOC     Function Size
Content-level Access Control   ≈ 29    2.7 kB
Compression                    ≈ 8     1.8 kB
Image processing               ≈ 17    2.3 kB
Signature verification         ≈ 43    2.9 kB
Interactive data query         ≈ 203   6 kB

[Figure 5: Zion runtime startup time. Time in seconds that a Docker container with the Zion Java runtime takes to start.]

6.4 Application characteristics
Table 1 shows information about our Zion applications. The LOC column gives the number of lines of code required to execute the function. The third column gives the function size used for each application. From this table, it can be seen that our functions are very lightweight for the proposed applications.

6.5 Results

Docker characteristics. The Java runtime, and then the functions, are run inside Docker containers in our prototype. Consequently, the first validation to do was to assess the impact of starting our runtime within Docker containers. If the startup time were too large, it would hamper the inline processing capabilities of Zion. Our experiments, however, revealed that this is not the case.

For this experiment, we utilized a pre-compiled Docker image with our Java runtime integrated in it. The experiment consisted of launching 1,000 containers at different rates in the compute cluster, and measuring their startup time, along with the amount of memory needed to keep the runtime up and running over time.

Fig. 5 shows the results of this experiment. As depicted, starting a new Docker container with the Java runtime takes between 0.85-0.95 seconds. Regarding RAM consumption, we found that each container consumes around 35MB of memory. These results show that our runtime takes 5X less time to start than AWS Lambda's runtime, which proves that Zion is lightweight enough for elastic inline processing. As Zion utilizes a large pool of already started containers, the startup time is typically negligible, and only amounts to 0.9 secs if there are no free containers in the pool.


[Figure 6: Zion base overhead. Boxplots of measured times (in seconds) for requesting 10kB objects in Swift (a) and for getting the same objects through Zion with a no-operation (nop) function (b).]

Zion overhead. As another basic experiment, we studied the extra overhead that Zion adds when running a function. The overhead is the time needed by the interception middleware to redirect the object to a compute node, take an available container, launch a function inside the Docker container, and pass the object data through it. For this experiment, we made use of the function listed in Listing 1.

This simple function iterates over the data without processing it. For measuring the overhead, we used a set of 10kB objects. We first launched 5,000 plain GET requests to measure the base time needed to complete a GET request. Then we launched another 5,000 GET requests for the same objects, but in this case we set up the function in Listing 1 to respond upon the onAfterGet trigger of the objects.

The results are plotted in Fig. 6. This figure shows that Zion's overhead is 9 ms. This time includes a penalty of 5 ms for the addition of an internal hop to Swift (redirecting the data through a compute node), plus 4 ms to determine whether, and which, function to run inside a free Docker container in the pool when the request touches a compute node.

Performance of collocated functions. We evaluated here the performance of our application functions. For this evaluation, we conducted a stress test for each combination of application, object size and number of function workers. As our storage nodes have 4 CPU cores, we ran each function on 1, 2, 3 and 4 function workers, respectively. The Docker containers were set up as follows: 1 CPU core and 512MB of RAM per worker. The results are depicted in Figs. 7-10.

The first experiments were done collocating the containers in the storage nodes, without using the Zion computation layer. The main objectives of this experiment were: 1. To show how collocation limits the number of TPS; and 2. To show the performance of Zion's functions using 1 CPU core per function worker. As a side effect, notice that the latter makes it possible to faithfully generalize our results to any number of cores.

[Figure 7: Compression function performance. Response time (ms) vs. TPS for 10kB, 100kB, 1MB and 10MB objects, with 1 to 4 workers.]

[Figure 8: Content-level access control function performance. Response time (ms) vs. TPS for 100kB, 1MB, 10MB and 100MB objects, with 1 to 4 workers.]

We empirically verified this for the image resize use case. As the cluster is made up of 6 storage nodes, the function with 1 worker was able to process around 9 images per second for the 400kB image case, and around 54 images per second in the entire cluster. For the rest of the applications, we utilized only one storage node. The reason is that, to tap into the performance of the entire cluster, we would have to replicate each object 6 times to ensure that a copy of it is available at every storage node. In active storage, the total computation power for an object is proportional to the number of replicas it has.

Notice that in Figs. 7-10 the curves representing the response time start to increase when the resources of the Docker container are overloaded, i.e., when all the CPU cores are at 100% of their capacity. These experiments also show how important the object size is. In all experiments, the larger the object size, the lower the number of TPS a function can handle. To wit, for the content-level access control application (Fig. 8), the function is unable to handle more than 1 TPS for the 100MB object, irrespective of the number of workers; concretely, this task takes around 16 seconds to complete. The same occurs with the image resizing function.


[Figure 9: Image resizer function performance. Response time (ms) vs. TPS for 90kB, 400kB, 800kB and 1200kB images, with 1 to 4 workers.]

[Figure 10: Signature verification function performance. Response time (ms) vs. TPS for 10kB, 100kB, 1MB and 10MB objects, with 1 to 4 workers.]

We saw that resizing an image of 1200kB takes around 2.36 seconds, leading to < 1 TPS.

A function that needs more than 1 second to process an object can only handle in parallel as many objects as function workers are running. These examples show that function collocation at the storage nodes would in any case be insufficient to absorb a burst of transactions lasting more than one second. They also demonstrate that it is almost impossible to predict resource consumption ahead of time, because resource usage depends on the object size. Consequently, resource management for collocated functions should be dynamic rather than a simple policy such as one worker per CPU core, and dynamic management is hard to achieve in practice.

Performance of non-collocated functions. As resource contention is not an issue with non-collocated functions, what is key here is to verify that Zion's storage-disaggregated functions are indeed scalable.

In order to test how Zion's compute layer provides better scalability than the storage nodes, we extended the prior evaluation to non-collocated functions. Our compute nodes have 24 cores each, which means that it is possible to start up to 24 function workers in each node.

[Figure 11: Image resizer function scalability. Max TPS vs. number of workers (storage nodes vs. Zion compute layer) for 90kB, 400kB, 800kB and 1200kB images.]

[Figure 12: Signature verification function scalability. Max TPS vs. number of workers (storage nodes vs. Zion compute layer) for 10kB, 100kB, 1MB and 10MB objects.]

However, to make the comparison with the storage nodes fair, we utilized between 1 and 8 function workers.

As in the previous experiments, we used 1 proxy and 1 storage node, in addition to 1 compute node, which is enough to assess the scalability of Zion. The results are shown in Figs. 11-14. In this case, we recorded the maximum number of transactions per second (Max TPS) that each number of workers was able to handle, with and without collocation.

First, we can see how, in almost all cases with 1 to 4 function workers, non-collocated functions can handle more transactions than the storage nodes. This is due to the fact that the compute nodes are more powerful than the storage nodes in terms of CPU; therefore, the ingestion capacity is slightly higher. Second, the maximum TPS that a storage node can handle is always bounded above by the number of available cores. That is, spawning more workers brings no benefit, because they will contend for the available resources. This is the reason why the curve for collocated functions (gray line) flattens out beyond 4 workers in all use cases.

However, non-collocated functions (black line) can continue to process transactions by spawning more workers to respond to the demand. By a quick inspection of all the figures, the scalability of non-collocated functions is practically linear with the number of workers.


[Figure 13: Compression function scalability. Max TPS vs. number of workers (storage nodes vs. Zion compute layer) for 10kB, 100kB, 1MB and 10MB objects.]

[Figure 14: Content-level access control function scalability. Max TPS vs. number of workers (storage nodes vs. Zion compute layer) for 100kB, 1MB, 10MB and 100MB objects.]

This suggests that disaggregating storage and compute is a practical solution to scale out computation in cloud object stores.

Interactive data queries. Table 2 compares the execution times of the same query (Listing 2) for Hadoop and Zion. The entries of this table were obtained by running each configuration 30 times and then averaging the results.

Unsurprisingly, the Hadoop configuration that ingests the data from the remote Swift service is the one that presents the highest mean execution time. This occurred due to network contention: both clusters were geographically far apart, so communication went through some Fast Ethernet links within our institution's LAN. More concretely, it took ≈ 917 seconds to transfer the entire 10GB dataset and ≈ 86 seconds to transfer the 1GB dataset.

The configuration that read the data from HDFS had better times, especially for the large datasets. However, it has the disadvantage of requiring all the data to be pre-loaded into HDFS, which may not be practical in many cases.

For Zion functions, the execution times were comparatively small, despite using at most 24 workers. This suggests that better times can easily be attained with more workers, as is clearly seen when comparing Zion with 12 and 24 workers, respectively.

Table 2: Interactive data queries execution times

Configuration       File size   Chunk size   Time
Pig query - Swift   100 MB      10 MB        81.6 s
Pig query - HDFS    100 MB      10 MB        71.4 s
Zion - 12 Workers   100 MB      10 MB        0.510 s
Zion - 24 Workers   100 MB      10 MB        0.348 s
Pig query - Swift   1 GB        32 MB        156 s
Pig query - HDFS    1 GB        32 MB        75.4 s
Zion - 12 Workers   1 GB        32 MB        41.83 s
Zion - 24 Workers   1 GB        32 MB        22.56 s
Pig query - Swift   10 GB       200 MB       985 s
Pig query - HDFS    10 GB       200 MB       94.6 s
Zion - 12 Workers   10 GB       200 MB       262.53 s
Zion - 24 Workers   10 GB       200 MB       133.92 s

For the 10GB dataset, the query with 12 workers took 2X more time than with 24 workers. This experiment also confirms the scalability of Zion without incurring resource contention in the storage cluster: if we had executed this interactive query with 24 workers in the storage nodes, the storage cluster CPU would have reached 100% in all storage nodes for 134 seconds, leading to resource contention.

Listing 2: Evaluated query in Hadoop and Swift + Zion clusters

select user_id, count(*) total
where (req_t = 'GetContentResponse' or req_t = 'PutContentResponse')
and msg = 'Request done'
group by user_id order by total DESC limit 10

7 CONCLUSION
This paper presents Zion, an innovative data-driven serverless computing middleware for object storage. Unlike commercial event-driven serverless models, our data-driven functions intercept and operate over objects as they are read/written from/to the object store. Since Zion is serverless, it overcomes the scalability and resource contention problems of active storage, without the need to manage a server or a runtime environment.

By injecting computations into the data pipeline, Zion is ideal for use cases that require synchronous interactions with external users. Examples of these use cases include (but are not limited to) dynamic content generation, interactive queries, personalization, content verification and access control. In many of them, the data locality of our inline computations contributes to optimizing latency and to reducing data transfers outside the data center.

ACKNOWLEDGMENTS
This work has been partially funded by the European Union Horizon 2020 Framework Programme in the context of the project IOStack: Software-defined Storage for Big Data (H2020-644182), and by the Spanish Ministry of Science and Innovation (Grant TIN2016-77836-C2-1-R).


REFERENCES
[1] Amazon. 2017. Athena. https://aws.amazon.com/athena. (2017).
[2] Amazon. 2017. Lambda. https://aws.amazon.com/lambda. (2017).
[3] Amazon. 2017. Redshift Spectrum. https://aws.amazon.com/redshift/spectrum. (2017).
[4] Apache. 2012. HBASE Coprocessors. https://blogs.apache.org/hbase/entry/coprocessor_introduction. (2012).
[5] Apache. 2017. Hadoop OpenStack Support: Swift Object Store. https://hadoop.apache.org/docs/stable/hadoop-openstack/index.html. (2017).
[6] Krste Asanovic and D. Patterson. 2014. FireBox: A hardware building block for 2020 warehouse-scale computers. In FAST.
[7] Prosunjit Biswas, Farhan Patwa, and Ravi Sandhu. 2015. Content Level Access Control for OpenStack Swift Storage. In ACM CODASPY. 123-126.
[8] Chao Chen, Yong Chen, and Philip C. Roth. 2012. DOSAS: Mitigating the resource contention in active storage systems. In CLUSTER. 164-172.
[9] Facebook. 2017. Presto: Distributed SQL Engine for Big Data. https://prestodb.io. (2017).
[10] Sadjad Fouladi, Riad S. Wahby, Brennan Shacklett, Karthikeyan Vasuki Balasubramaniam, William Zeng, Rahul Bhalerao, Anirudh Sivaraman, George Porter, and Keith Winstein. 2017. Encoding, Fast and Slow: Low-Latency Video Processing Using Thousands of Tiny Threads. In NSDI.
[11] Peter X. Gao, Akshay Narayan, Sagar Karandikar, Joao Carreira, Sangjin Han, Rachit Agarwal, Sylvia Ratnasamy, and Scott Shenker. 2016. Network requirements for resource disaggregation. In OSDI. 249-264.
[12] Google. 2009. BigTable Coprocessors. https://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf. (2009).
[13] Google. 2017. Cloud Functions. https://cloud.google.com/functions/docs. (2017).
[14] Raúl Gracia-Tinedo, Pedro García-López, Marc Sánchez-Artigas, Josep Sampé, Yosef Moatti, Eran Rom, Dalit Naor, Ramon Nou, Toni Cortés, Pietro Michiardi, and William Oppermann. 2016. IOStack: Software-defined Object Storage. IEEE Internet Computing (2016).
[15] Raúl Gracia-Tinedo, Josep Sampé, Edgar Zamora, Marc Sánchez-Artigas, Pedro García-López, Yosef Moatti, and Eran Rom. 2017. Crystal: Software-Defined Storage for Multi-Tenant Object Stores. In FAST.
[16] Raúl Gracia-Tinedo, Yongchao Tian, Josep Sampé, Hamza Harkous, John Lenton, Pedro García-López, Marc Sánchez-Artigas, and Marko Vukolic. 2015. Dissecting UbuntuOne: Autopsy of a Global-scale Personal Cloud Back-end. In IMC. 155-168.
[17] Scott Hendrickson, Stephen Sturdevant, Tyler Harter, Venkateshwaran Venkataramani, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2016. Serverless computation with OpenLambda. In HotCloud.
[18] HP. 2017. The Machine: Our vision for the Future of Computing. https://www.labs.hpe.com/the-machine. (2017).
[19] IBM. 2017. OpenWhisk. https://developer.ibm.com/openwhisk. (2017).
[20] Eric Jonas, Shivaram Venkataraman, Ion Stoica, and Benjamin Recht. 2017. Occupy the Cloud: Distributed computing for the 99%. arXiv preprint arXiv:1702.04024 (2017).
[21] Microsoft. 2017. Azure Functions. https://azure.microsoft.com/en-us/services/functions. (2017).
[22] Netflix. 2016. Evolution of the Netflix Data Pipeline. http://techblog.netflix.com/2016/02/evolution-of-netflix-data-pipeline.html. (2016).
[23] OpenStack. 2017. Nova. https://docs.openstack.org/nova. (2017).
[24] OpenStack. 2017. Swift. https://docs.openstack.org/swift. (2017).
[25] OpenStack. 2017. Swift Large Objects. https://docs.openstack.org/swift/latest/overview_large_objects.html. (2017).
[26] Juan Piernas, Jarek Nieplocha, and Evan J. Felix. 2007. Evaluation of active storage strategies for the Lustre parallel file system. In SC. 28.
[27] Lingjun Qin and Dan Feng. 2006. Active Storage Framework for Object-based Storage Device. In AINA. 97-101.
[28] Simona Rabinovici-Cohen, Ealan Henis, John Marberg, and Kenneth Nagin. 2014. Storlet engine: performing computations in cloud storage. Technical Report H-0320, IBM (August 2014).
[29] Paul Rad, Van Lindberg, Jeff Prevost, Weining Zhang, and Mo Jamshidi. 2014. ZeroVM: secure distributed processing for big data analytics. In WAC. 1-6.
[30] Redis. 2017. Redis. https://redis.io. (2017).
[31] Erik Riedel, Garth A. Gibson, and Christos Faloutsos. 1998. Active Storage for Large-Scale Data Mining and Multimedia. In VLDB. 62-73.
[32] Michael T. Runde, Wesley G. Stevens, Paul A. Wortman, and John A. Chandy. 2012. An active storage framework for object storage devices. In MSST. 1-12.
[33] Josep Sampé. 2017. Zion framework. https://github.com/JosepSampe/zion. (2017).
[34] Josep Sampé, Pedro García-López, and Marc Sánchez-Artigas. 2016. Vertigo: Programmable micro-controllers for software-defined object storage. In IEEE CLOUD. 180-187.
[35] Seung Woo Son, Samuel Lang, Philip Carns, Robert Ross, Rajeev Thakur, Berkin Ozisikyilmaz, Prabhat Kumar, Wei-Keng Liao, and Alok Choudhary. 2010. Enabling active storage on parallel I/O software stacks. In MSST. 1-12.
[36] Zhipeng Tan, Yanli Yuan, Dan Feng, Tian Zhan, and Xudong Tu. 2011. Implementation of method object in active object-based storage systems. In CCIS. 204-211.
[37] Ubuntu. 2017. Cloud image repository. https://cloud-images.ubuntu.com. (2017).
[38] Ubuntu. 2017. Docker repository. https://hub.docker.com/r/_/ubuntu. (2017).
[39] University of California, Irvine. 1996. Adult Data Set. http://archive.ics.uci.edu/ml/datasets/Adult. (1996).
[40] Himanshu Vashishtha and Eleni Stroulia. 2011. Enhancing query support in HBase via an extended coprocessors framework. In ServiceWave. 75-87.
[41] Rajiv Wickremesinghe, Jeffrey S. Chase, and Jeffrey Scott Vitter. 2002. Distributed computing with load-managed active storage. In HPDC. 13-23.
[42] Yulai Xie, Dan Feng, Yan Li, and Darrell D. E. Long. 2016. Oasis: an active storage framework for object storage platform. Future Generation Computer Systems 56 (2016), 746-758.


Page 3: Data-Driven Serverless Functions for Object Storagefaculty.washington.edu/wlloyd/courses/tcss562... · using AWS Lambda for data-intensive tasks over disaggregated storage [10, 20]

Data-Driven Serverless Functions for Object Storage Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA

Storage

API

StorageGateway

StorageNode

Hard D

isk

StorageNode

MetadataService

StorageGateway

StorageNode

Computation Layer

MiddleboxMiddleboxMiddleboxMiddleboxComputeNode

Storage Network

Hard D

iskH

ard Disk

Figure 1 High-level architecture overview of the proposed model This figure shows the basic components of the proposedarchitecture to run interception functions in an object storage system

associated latency overhead Unlike our model these systems can-not provide generic function computations since they are focusedon interactive queries

As we will explain in the rest of the paper our model enablesinterception and transformation of data flows directed to the objectstore We place a scalable compute layer in the data path betweenproxy and storage nodes to reduce latency for interactive commu-nications

3 DESIGN OVERVIEWZion has been designed for scalable data-driven execution of smallfunctions in object stores And thus all components revolve aroundthe object storage service Zionrsquos design therefore assumes that theunderlying object store follows the ldquoclassicalrdquo architecture of loadbalancers distributing the workload evenly across the gateways orproxies and a large pool of storage nodes which are in charge ofstoring the objects in the hard disks

As a first overview Fig 1 shows a diagram of Zionrsquos architec-ture To not interfere with both plain PUTGET requests and storagemanagement tasks such as replication and failover Zion lays out adisaggregated computing layer between the storage and gatewaynodes for executing the functions Zion also integrates a metadataservice and an interception software running in the storage gate-ways which inspects incoming requests and reroutes them to thecompute tier if necessary

31 Interception Software and Metadata ServiceThe first component of the system is the interception layer whichis integrated in the storage gateway (depicted as a router in Fig 1)The major aim of this software is to manage the deployment offunctions the association of triggers to these functions and theirexecution when a request matches a trigger

A trigger is a combination of a URL with prefix and suffix filterssimilar to AWS Lambda for Amazon S3 and a HTTP method (GETPUT POST and DELETE) This interception mechanism is enoughfor many use cases By specifying the suffix txt as a filter forinstance Zion can run a compression function to all GET requestsfor text objects The list of triggers is the following

bull onPut onPost and onDelete which cause the execu-tion of the associated function whenever a PUT POST orDELETE request is received respectively As an example

the onPut trigger can be useful to process an object beforeits write to the object store and even discard its storage asprocessing is part of the write path and not asynchronousas in AWS Lambda

bull onBeforeGet a function associated to this trigger is ranwhen a user performs a GET request to the storage serviceThis is one of the two cases associated to the GET requestsIn this case the function is executed before forwarding therequest to the storage node and hence the function cannotprocess the targeted data object However this trigger canbe useful in many use cases like HTTP headers processingURL rewriting temporarily redirects etc

bull onAfterGet which causes any function associated to thistrigger to run on an incoming GET request to the storageservice In this case the function intercepts the storagenodersquos response and therefore it can dynamically manip-ulate the objectrsquos content

The metadata service is the second architectural component ofZion Metadata for triggers is pre-processed and indexed efficientlyin the metadata service to guarantee a small O(1) request matchingoverhead of the order of microsecs If there is any match this layer isalso responsible for redirecting the input flow as the object is readto an available worker Non-intercepted data flows rapidly followthe default storage path and bypass the serverless compute layeras usual

It is worth to note here that the fact that a serverless executionmodel can quickly spawn new workers is what makes it possible tointercept and process data flows ldquoon-the-flyrdquo without collocationWith VMs interception will be significantly more difficult to achieveas VMs can take minutes to start

32 Computation LayerThe third architectural component is the serverless computationlayer The computation layer is a pool of containers which puts thefunctions into execution

321 Functions A function is the computation code unit whichcan process the data In our model functions are data-driven iethey are focused to intercept the data flow and process the datainline as the object comes-in or comes-out from the storage clusterBecause of this in our model the response time from the functions(time to first byte) must be fast to not affect the user experience

123

Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA Josep Sampeacute Marc Saacutenchez-Artigas Pedro Garciacutea-Loacutepez Gerard Pariacutes

In addition to processing the data stream functions can storeinformation in a persistent way (eg an access counter the times-tamp of the last access the name of the user who is accessing etc)Concretely a function can take one or all of the following actionsafter intercepting a lifecycle request of an object

bull It can update the object metadata (eg an access counter)bull It can generate new requests to the object storage service

This includes GET an object for example a dependencyneeded to process the main stream PUT an object forexample PUT a subset of the main object as it is processedDelete an object and POST metadata to another object

bull It can generate new requests to other services (eg Rab-bit Mongodb etc) for example to store some relevantinformation extracted of the main stream

bull It can update the requestresponse headers andbull It can cancel or rewire the request to another object

Functions may make use of third-party library dependenciesin order to achieve a specific behavior Once developed functionsshould be packed with them within a TAR file Therefore in con-junction functions and their dependencies must be lightweight insuch a way to minimize the time needed to transfer the functionpackage from the storage system to the compute node

Once packed a function is uploaded as a regular object to theobject store An interesting feature of Zion is that it allows theuser to set up the CPU and memory requirements and a timeoutvalue for every function The timeout is the amount of time thesystem waits to receive the first byte of the functionrsquos output Ifthe function times out the request is automatically canceled Thisinformation allows functions to be configured differently in order tobetter manage certain jobs This information is not mandatory andZion has the last word assigning default values when necessary

Further Zionrsquos functions accept parameters Parameters can beexplicit or implicit Explicit parameters are provided as headers inthe request Implicit parameters are default parameters that a userspecifies ahead of time in the moment of associating a trigger witha function Explicit parameters take precedence over implicit onesin the case of collision As an example consider a function to resizean image and the image resolution as a parameter If no argumentwas passed in the request the implicit image resolution would betaken from the functionrsquos metadata or an error would be thrownaccordingly The same function can have different parameter setsfor different triggers

A final important remark is that two different functions cannotintercept the same data flow in Zion unless they do it in a pipelinefashion one after another which raises no consistency issues

322 Compute Node In a compute node functions are runinside isolated environments or containers

Containers Each function has its own Linux container in orderto not interfere with the other cloud functions A container with afunction running inside is called a worker A function may have zeroone or more workers running at the same time depending on theworkload In the traditional function model starting a new workertakes around 6 minus 7 seconds [17] One requirement of our model isthat functions have to start running as soon as possible for thisreason we leverage a ready-to-use pool of containers In any case

our experiments verify that starting a new container takes around09 seconds which is practical enough for many synchronous andnear-real-time applications After a certain period of time the idleworkers are stopped and the corresponding containers are recycledin order to better optimize resource consumption

Zion Service This service manages the requests forwarded fromthe interception layer (31) When a new request arrives it takes acontainer from the pool installs the libraries and the function codeand sends the execution command to the runtime As functions mayhave specific CPU and memory requirements this service is also incharge of establishing the resource limits to containers according totheir configuration parameters It also load balances the workloadacross already started workers and starts new workers if necessary

Runtime The runtime is integrated into the containers It acceptsfunctions for a specific programming language and puts them intoexecution Our prototype of Zion currently supports Java but otherlanguages such as Python would be easy to integrate

4 IMPLEMENTATIONWe have implemented a prototype of our serverless framework [33]on top of OpenStack Swift [24] Swift is a highly scalable object storethat can store a large amount of data through a RESTful HTTP APIsimilar to that of Amazon S3 The access path to an object consistsof exactly three elements accountcontainerobject

The object is the exact data input by the user while accounts andcontainers provide a way of grouping objects Nesting of accountsand containers is not supported

Swift is split into several components which include accountcontainer and object servers the latter responsible for handlingstorage and management of objects All of these servers are typi-cally deployed in the storage nodes themselves In addition proxyservers expose the Swift API to users and stream objects to andfrom the client upon request In Swift data replication and dataplacement is provided by rings Each ring is a storage policy thatcontains the storage nodes and the HDDs where to store the objects

We have decided to make Zion the less intrusive as possible Asa result the only modification in the base Swift architecture is aSwift middleware which intercepts the requests at the proxy sideThe other elements are decoupled of the main Swift architecturewhich makes Zion easier to deploy

41 Interception Software and Metadata ServiceIn Swift the simplest way to intercept requests is to create a Swiftmiddleware We built a new Swift interception middleware forZion to accomplish two primary tasks 1 The management of func-tion code deployment and libraries including the triggers thatcause the functions to be run and 2 Redirection of requests andresponses through the computation layer when they need to beprocessed by any function

Upon the assignment of a function a trigger header of the typeonPut onBeforeGet onAfterGet onPost and onDelete must beappointed to tell the framework which lifecycle events to inter-cept Zion uses Redis [30] a quick in-memory key-value store asdistributed metadata service to maintain this information To opti-mize request matching Redis is collocated with the proxy nodes

124

Data-Driven Serverless Functions for Object Storage Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA

Recall that as part of the metadata Zion also includes configura-tion information for the functions such as the CPU and memoryrequirements and cancellation timeouts as we already discussedin the preceding section

Function assignment granularity Although OpenStack Swiftstores the objects in a flat name space is possible to simulate ahierarchical structure within a single container by adding forwardslash characters () in the object name Each folder between thecontainer and object names is called a pseudo-folder in Swift Forexample in the object name imageszionjpg the prefix imagesis the pseudo-folder

In Zion we take advantage of this logical hierarchy to enablefunction assignment at per-object pseudo-folder and container lev-els Moreover we also enable mappings at suffix level for exampleto run functions to all objects whose name ends with jpg

Function execution After the function assignment process whena request arrives for an object the Zionrsquos middleware with the helpof the metadata service checks if that request triggers the executionof a function For example if the request is a PUT the middlewarewill launch an onPut trigger for the request If there are functionsthat respond to this trigger the middleware will immediately for-ward the request to an available container Otherwise the requestwill follow the default readwrite path

Parallel processingwith functions In all object stores there is alimit in the maximum allowed object size In Swift this limit is 5GBTo bypass this limitation Swift uses a special object called StaticLarge Object [25] (SLO) SLOs are objects split into differentparts The user must upload these parts together with a specialobject called manifest which contains the location of the objectparts Getting an object is totally transparent to the users whomake a request to the manifest and the whole object is returned asif it was stored as a single object in Swift

This Swift SLO approach enables Zion to associate a differentfunction to the manifest and to the parts With this mechanism itis possible to create a highly parallel and distributed computationalsubstrate by executing a function to each dataset part and finallyby running a reduction function to the filtered parts In Section 5we present a Hadoop-like use case for object storage implementedwith our functions This is aligned with the recent trend of largescale data analytics with serverless computing [10 20]

42 Computation LayerIn our implementation the computation layer is composed by apool of compute nodes They are located between the proxies andthe storage nodes as shown in Fig 1 We note at this moment thatthe compute nodes assigned to Zion are not shared with OpenStackNova [23] they are exclusively managed by Zion

A general overview of how Zion operates is the following Eachfunction is run inside a separate Docker container what is called aworker At the time of this writing Zionrsquos runtime is Java-basedAnd consequently every function is run in a Java Virtual Machine(JVM) At a very high level a worker can be viewed as a containerrunning a specific function Every new invocation to the functionis handled by a new thread inside the JVM

421 Functions As the Zionrsquos runtime is based on Java Zionrsquosfunctions are also written in Java This means that functions areplain Java classes They have an entry point called invoke whichcontains the main code of the function and that it is called uponevery new request The invoke method has two arguments of typeContext and API which are made available automatically by theZionrsquos runtime on every new request

The Context encapsulates the access to the request headers theobjectrsquos metadata and the objectrsquos data stream It also includes alogger facility for logging the execution of the function The APIenables access to external services By default the isolation level ofDocker containers precludes functions from having any access toexternal resources Through the API a function can access to someexternal services such as RabbitMQ and Redis in addition to Swiftitself The API class is extensible and has been framed to facilitateexternal interaction to future services

Here we have the implementation code of a simple function thatiterates over the data with no further processing

Listing 1 A function that echoes the data passed to itpublic c l a s s Handler implements I F u n c t i o n

public void invoke ( Contex t c tx API a p i ) while ( ( d a t a = c t x o b j e c t s t ream read ( ) ) )

c t x o b j e c t s t ream w r i t e ( d a t a )

422 Compute Node The complete architecture for computenodes is depicted in Fig 2 and consists of

Containers We use Docker containers to sandbox functions sothey have neither direct network access nor system execution andthread creation capabilities nor access to the local file system Onlya minimum ephemeral hard disk space is provided under the tmpdirectory to be used by functions We used the Ubuntu 1604 Dockerimage [38] for our implementation

Zion Service The Zion service is a WSGI server whose missionis to spawn workers for processing the request forwarded by theinterception middleware Consequently it must be up and runningin all compute nodes because as described in 322 it is responsiblefor managing containers and executing functions When a functionis running this service waits until receiving a response from itThe waiting time is limited by the default systemrsquos timeout or by auser-defined timeout specified at deployment time

Runtime For the runtime we created a Docker image based onUbuntu 1604 with the Java runtime installed on it to rapidly spawnnew containers and execute functions

As stated in previous section each different function is run in aseparate container If the compute node does not have any workeravailable for that function the Zion service takes a Docker con-tainer from the pool of containers and loads the function code andlibraries Subsequent requests will then be able to reuse the sameworker when processing the same ldquohotrdquo function

One feature of our functions is that it is possible to modifythe objectrsquos metadata (keyvalue) via the Context By defaultSwift stores 3 copies of each object However Swift is an eventuallyconsistent system This means that at a given time the replicas

125

Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA Josep Sampeacute Marc Saacutenchez-Artigas Pedro Garciacutea-Loacutepez Gerard Pariacutes

Zion Service

Java Runtime

Java VM

Function 1

Function 1Worker 1

Java Runtime

Java VM

Function 1

Function 1Worker 2

Java Runtime

Java VM

Function 2

Function 2Worker 1

ToFrom Proxy TF Storage Node

Java Runtime

Docker waiting for a new function worker

Figure 2 Zion Compute Node High level architectureoverview of the components of a Zion Compute Node

of an object may have different data and metadata As functionsare stateless ie there is no relation between the different functioninvocations even if they occur in the same worker modifying thelocal copy of an objectrsquos metadata implies waiting for Swift toupdate all the replicas This behavior is clearly undesirable in termsof functionrsquos parallelism

To mitigate this issue each function worker is directly attachedto a distributed metadata caching system based on Redis It is aninternal feature totally transparent to the users who develop func-tions When a function is executed it initially loads the objectrsquosmetadata into the cache and then performs all the metadata modi-fications over it If there are other requests to the same object theother function invocations will use the previously-loaded metadataWhen the function terminates it offloads the metadata from thecache to all objectrsquos replicas

With this distributed cache we ensure that the objectsrsquo metadatatouched by the functions is always consistent in all replicas evenif there are multiple requests at the same time With it functionscan even do atomic operations over the metadata fields

5 APPLICATIONSHere we show some of the applications that Zion can support andthe ease with which these applications can be built on top of SwiftAs expected the described use cases are not arbitrary they havebeen chosen to show the potential of our data-driven model and inparticular for synchronous interactions with the object store Allof them have been implemented over Swift and evaluated in thenext section

Content-level Access Control With Zion it is extremely easyto implement sophisticated forms of access control based upon thecontents of the objects themselves This example is ideal to showthe limitations of AWS Lambda First it clear that access controlrequires the interception of the GET requests which can only bedone indirectly with the help of the API Gateway service Secondthe processing of the objectrsquos content to satisfy the access controlpolicy must be done inline as the user needs a timely notificationabout the status of her request which again cannot be realizedwith AWS Lambda Finally Zionrsquos design enables access controlto leverage data locality and improve response time as functions

go through the contents of the object as it is read from the storagenode

Typically access control in object storage operates at the gran-ularity of containers and hence once an object is accessible tosome party he gets the full content of the object Swift also followsthis ldquoall or nothingrdquo approach where the access to objects insidea container is enforced through access control lists (ACLs) whichmay be insufficient when objects contain sensitive content

In the exercise to show another capability of our framework weshow how content level access control can be realized very easilyin Swift thanks to our function abstraction in a similar fashion butcleaner than [7] By ldquocontent levelrdquo we mean that Swift users will beable to access to certain parts of an object based on their credentialsTo give a concrete example consider the publicly available Adultdataset from the UCI Machine Learning Repository [39] which con-tains about 48 000 rows of census information Each row containsattributes like race sex and marital-status which combinedwith explicit identifiers such as the SSN1 that identify the recordholders may leak sensitive information about a given individualAs a result the records of this object should be accessed differ-ently depending upon the user role For instance while a ldquopoliceagentrdquo should be able to access to all fields SSN age educationmarital-status race sex relationship capital-gain andnative-country a ldquocensus analystrdquo could be restricted to get onlya smaller view age education capital-gain native-country

To implement this example we have linked a function to theonAfterGet object trigger to enforce content level access controlon the object adult_datacsv We have defined a simple accesspolicy that depending on the use role ldquopolice agentrdquoor ldquocensusanalystrdquo allows to get all the fields or just an smaller projectionview This simple access policy has been stored as implicit parameterof the function ie in a JSON formatted string uploaded when welinked the object with the function as explained in 321 When aGET request comes for the object adult_datacsv the proxy firstchecks the Swift ACL If the object is accessible by that user thefunction then reads the content level policy and filters the dataonly if the user has the appropriate role

Compression A typical data reduction task is the compressionof objects In general any dynamic content filtering that involvesinline transformations of the content is ideally suited for our data-driven model A data-driven task could compressdecompress a filedynamically ldquoon-the-flyrdquo As Zion acts directly on the data pipelineie as the object is readwrite to the object store the result willbe either latency improvement or space reduction depending uponwhether Zion intercepts a GET request or a PUT request

Here we will merely consider the case of compressing incomingobjects ldquoon-the-flyrdquo which means that upon a GET request by theuser the target object will have to undergo decompression Thecommon approach for doing so in Swift is to implement an ad-hoccompression middleware However this approach is problematicFirst by running compression on the storage nodes compressionis repeated as many times are replicas there are By offloading it tothe proxies we can disrupt Swiftrsquos normal operation under heavy

1As the Adult dataset does not contain explicit identifiers we added a random SSN toeach row using the Fake Name Generator

126

Data-Driven Serverless Functions for Object Storage Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA

load since the proxies are responsible for looking up the locationof objects in the rings and routing the requests accordingly

With Zion we can easily write down a compression functionand execute it between the proxies and the storage nodes in a scal-able way without worrying about resources or repeating the taskmany times In addition to the possibility to intercept GET requestsZion has another advantage over AWS Lambda It is capable to runcompression over objects whose total compression time exceed thefive minutesrsquo limit This is because Zion cancels a request only ifthe time for the receipt of the first byte from the function exceeds atimeout value (see Section 321 for details) This model is ideal foroperations such as compression that can be run as data is readwritewhich have been the focus of active storage for a long time

To implement the compression function we utilized gzip andthen we mapped it to a container with the onPut trigger As suchall writes to this container will be compressed We did the reverseprocess for the onAfterGet trigger so that when a user requestedan object of this container she would get the original uncompressedversion To do so we made use of an implicit parameter to tell thefunction what to do either to compress or decompress That is forthe onPut trigger the implicit parameter was set to ldquocompressionrdquoFor the onAfterGet trigger the implicit parameter value was setto ldquodecompressionrdquo

Image processing One of the archetypal use cases of serverlesscomputing is that of image resizing for we found it interesting toevaluate it here It is ideal to show the potentials of asynchronousevent-based functions such as AWS Lambdas and also very usefulfor tenants that use the object store as a back-end for web imagestorage In this case when the images are uploaded a function istriggered resizing and creating all the needed images for examplefor the different possible devices that can request the main web(phone tablet PC etc)

We did the same with Zion and coded a function that resizes animage to an arbitrary percentage To intercept the PUT requestswe linked it to an onPut trigger and specified jpg as the suffixof the object name So all objects whose name ended with jpgwere rescaled As Zion allows to create new objects as part of thefunctionrsquos output the function stores the original object and itsresized version(s)

Because of interception one interesting feature of Zion is that itdoes not require to fully store an image before the generation of itsresized version(s) as it occurs with AWS Lambda and S3 Since it isdone ldquoon-the-flyrdquo prior to storage Zion saves storage bandwidthAlthough of not much concern at first glance this property is veryinteresting for concurrently processing a vast collection of imagesas in [10] because IO savings add up rapidly for a large amount ofobjects

Signature verification. To demonstrate the versatility of Zion, we proposed the signature verification use case. Online content checking is again a perfect use case for data-driven functions, since it requires an immediate response to the client.

The main objective of this function is to verify that the objects (documents) that are uploaded are signed by the user, and that the signature is valid, i.e., that the documents are authentic. With Zion, it is possible to do this task in near-real time and notify the user instantly in the case of rejection. Also, in the case of rejection, we prevent the archival of an inauthentic document, thereby saving storage space. For the same reasons as above, it is readily evident that this use case cannot be implemented with AWS Lambdas.

The scenario is as follows. The user obtains the public key from her RSA key pair and uploads it to a public container in the object storage system. Then, the user signs the document with the private key and uploads it, with the signature in a special header, to the object storage system, which puts the function into execution and verifies the document. Note that in this case the function uses an explicit parameter (the signature), as described above.

To do so, we implemented a signature verification function. The function is mapped, with a container, to the onPut trigger; therefore, all object PUTs to this container will be verified. The function first gets the public key from the object store, based on the user who is uploading the object. Then it loads the document content and verifies the signature. If the signature is valid, the document is stored; otherwise, the document is rejected, sending an error message to the user.
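A condensed sketch of this logic follows, using the standard java.security API with "SHA256withRSA"; the header, user, key-retrieval, and cancellation accessors (getHeader(), getUser(), api.swift.get(), cancelRequest()) are hypothetical names of our own.

import java.security.KeyFactory;
import java.security.PublicKey;
import java.security.Signature;
import java.security.spec.X509EncodedKeySpec;
import java.util.Base64;

public class SignatureVerifyHandler implements IFunction {

    public void invoke(Context ctx, API api) {
        try {
            // Explicit parameter: the signature travels in a request header.
            byte[] sig = Base64.getDecoder().decode(ctx.getHeader("X-Signature"));
            // Fetch the uploader's public key from the public container.
            byte[] keyBytes = api.swift.get("keys/" + ctx.getUser() + ".der");
            PublicKey pub = KeyFactory.getInstance("RSA")
                    .generatePublic(new X509EncodedKeySpec(keyBytes));

            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(pub);
            byte[] doc = ctx.object.stream.getInputStream().readAllBytes();
            verifier.update(doc);

            if (verifier.verify(sig)) {
                // Authentic: let the document continue to the storage node.
                ctx.object.stream.getOutputStream().write(doc);
            } else {
                // Reject before archival and notify the client immediately.
                ctx.cancelRequest("403 Forbidden: invalid signature");
            }
        } catch (Exception e) {
            ctx.cancelRequest("400 Bad Request: " + e.getMessage());
        }
    }
}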

Interactive queries and result aggregation. Finally, interactive querying is a use case that perfectly matches our data-driven model. When we want to perform fast data queries over existing data repositories, our data-driven model avoids moving the entire dataset to a computing cluster.

For example, object storage services are commonly used to archive data like log files. Businesses that want to extract fast insights from these data repositories using big data analytics tools must choose between two strategies: 1. Moving the data to the computation cluster to take advantage of data locality; or 2. Using a connector to allow data analytics tools to read data directly from the remote object storage. With Zion, we offer a third strategy: compute in the storage cluster using functions that filter data and aggregate results inline.

The case example we present is a top-k query on the access logs of UbuntuOne [16], a personal cloud service. We want to obtain a list of the most active users and the total number of requests each user sent. The implementation of this query in Zion has two functions. The first one receives a fragment of the log file and filters request logs, maintaining a counter for each user, thus exploiting parallelism. The other function receives as input the partial counters of the various instances of the first function, and it performs the aggregation and sorting needed to produce the desired result. To implement this MapReduce-like example, we tapped into Swift's SLO-based data partitioning (see Section 4.1 for further details).
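A sketch of the first-stage (filter) function appears below; it applies the same predicate as the query in Listing 2 to one SLO chunk. The stream accessors, the log-field parser, and the choice to emit partial counters as a small JSON object are illustrative assumptions of our own.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class TopKFilterHandler implements IFunction {

    public void invoke(Context ctx, API api) {
        Map<String, Long> counts = new HashMap<>();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(
                ctx.object.stream.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {
                // Keep only GetContentResponse/PutContentResponse "Request done" entries.
                if ((line.contains("GetContentResponse") || line.contains("PutContentResponse"))
                        && line.contains("Request done")) {
                    counts.merge(userIdOf(line), 1L, Long::sum);
                }
            }
            // Emit the partial counters as a small JSON object for the aggregator.
            StringBuilder json = new StringBuilder("{");
            counts.forEach((u, c) -> json.append('"').append(u).append("\":").append(c).append(','));
            if (json.length() > 1) json.setLength(json.length() - 1);
            json.append('}');
            ctx.object.stream.getOutputStream()
               .write(json.toString().getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            ctx.logger.error("filter failed: " + e.getMessage()); // hypothetical logger
        }
    }

    // Illustrative parser: assumes the user id is a fixed field of the log line.
    private String userIdOf(String line) {
        return line.split(",")[1];
    }
}

The second function simply merges these partial maps, sorts the totals in descending order, and keeps the top 10, mirroring the order by / limit clause of Listing 2.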

6 EVALUATION

To confirm our suspicions, we first studied how Swift behaves when some computation tasks are collocated in the same storage node. We then ran micro-benchmarks to measure the behavior and the overheads of our functions. We did so by running standalone experiments and the applications discussed in Section 5.

6.1 System Setup

Our experimental testbed consisted of a host (or client) with 2 VCPUs and 4GB RAM. On the server side, we deployed Zion in an 8-machine rack OpenStack Swift (Ocata version) installation formed by 2 proxy nodes (Dell PowerEdge R320 with 12GB RAM) and 6 storage nodes (Dell PowerEdge R320 with 8GB RAM, each one with


[Figure 3: Swift measurement. Measured response times (a) and CPU consumption for a given storage node (b) for getting objects at a certain rate (transactions per second, or TPS), for different numbers of Swift workers. Panels: (a) transaction response time (seconds) vs. TPS; (b) storage node CPU usage (%) vs. TPS; curves for 1 to 4 workers.]

[Figure 4: Swift interference measurement. These plots show how, for a given storage node, the Swift object storage service behaves for different numbers of workers and for different CPU loads of interference (Intf). Panels: (a) 1, (b) 2, (c) 3, and (d) 4 Object Server workers; response time (ms) vs. TPS, with curves for 0%, 20%, 40%, 60%, and 80% interference.]

4 CPU cores). On the compute side, the computation layer is composed of 3 nodes (Dell PowerEdge R430 with 32GB RAM, each one with 24 CPU cores). All the rack machines ran Ubuntu Server 14.04. The client host ran an Ubuntu 16.04.1 CloudImage [37].

6.2 Swift Resource Contention

We first studied how the collocation of compute and data affects Swift. To do so in "ideal" conditions, we restricted this experiment to a single storage node2. For this measurement, we used a single proxy node (Dell PowerEdge R320 with 12GB RAM) and 1 storage node (Dell PowerEdge R320 with 8GB RAM).

Base Swift. The first experiment consisted of evaluating how Swift normally works (Fig. 3). For this experiment, we stored a bunch of random 10kB files into Swift. Next, using the Unix httperf tool, we ran distinct workloads, each one differing in the number of transactions per second (TPS). We measured the resulting per-transaction response time (Fig. 3a) and CPU usage (Fig. 3b) of the storage node.

Technically, Swift uses workers to handle a workload. In Swift, a worker is nothing but a thread that accepts requests. Each worker

2 We note that the results are easy to extrapolate to larger Swift deployments.

normally accepts 1,024 concurrent requests, though this is a configurable parameter. This simple experiment confirmed that when a worker exhausts 100% of its core's resources, the response time steeply increases due to queuing delays. For instance, with 1 worker, the core's usage reaches 100% at around 380 TPS, the point beyond which requests start to accumulate, as shown in Fig. 3a. This effect can be alleviated by starting new Swift workers, as seen in this figure. However, since storage nodes have only 4 cores, Swift is limited to 4 workers. Therefore, a storage node will eventually end up experiencing resource contention sooner or later, even if there are no computations collocated with the data.
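As a rough illustration of why the response time diverges at saturation (a textbook single-server queueing approximation of our own, not a model fitted to the measurements), the mean response time R behaves as

    R = S / (1 − ρ),  with ρ = λ / μ,

where S is the per-request service time, λ the offered load in TPS, and μ the rate one worker can sustain (around 380 TPS in Fig. 3). As ρ approaches 1, R grows without bound, which matches the knee observed in the curves.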

Collocated Computation. We repeated the same experiment, but collocating computations with the storage service. The results in Fig. 4 show how Swift's processing capacity diminishes as a function of the CPU usage borrowed by the collocated tasks. For example, if Swift were restricted to use only 1 worker for request processing, collocated tasks would have almost no impact on Swift. However, as the number of Swift workers increases to match the number of CPU cores, resource contention begins to produce negative effects in the storage system, due to the interference from the collocated computations.


Swift's best practices recommend using as many workers as CPU cores. Under heavy use of the CPU by the collocated tasks, this figure shows that Swift request processing would be severely diminished. For instance, when the collocated tasks consumed 80% of the total CPU, Swift processing fell to 35% in this experiment. This result suggests that the right path to take is to move computations outside the storage nodes, which is what we did with Zion.

6.3 Application Setup

Now we describe the specific evaluation setup for our example use cases. As object stores can contain a huge diversity of object types and sizes [16], we set up heterogeneous workloads for all use cases in order to verify the behavior of Zion under different scenarios.

Content-level access control. For testing this application, we used the dataset described in Section 5, but trimmed down to different object sizes: 100kB, 1MB, 10MB, and 100MB. As stated before, the dataset content is filtered according to the type of user that requested the object. To this aim, we used Swift user roles to return only specific fields to each user. The function associated with the onAfterGet trigger reads from an implicit parameter the JSON string containing the allowed fields for a specific user's role (e.g., age, education, marital-status), and then returns them to the user.
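To make the filtering step concrete, below is a minimal sketch of the projection logic; the function's invoke method would wrap the intercepted object stream in a reader/writer pair and call it with the field set parsed from the implicit JSON parameter (class and method names are ours, for illustration).

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.StringJoiner;

public class ProjectionHelper {

    // Copies from in to out only the CSV columns named in the role's policy.
    public static void project(BufferedReader in, BufferedWriter out,
                               Set<String> allowed) throws IOException {
        String[] header = in.readLine().split(",");
        List<Integer> keep = new ArrayList<>();
        for (int i = 0; i < header.length; i++) {
            if (allowed.contains(header[i].trim())) keep.add(i);
        }
        StringJoiner head = new StringJoiner(",");
        for (int i : keep) head.add(header[i]);
        out.write(head.toString());
        out.newLine();
        String line;
        while ((line = in.readLine()) != null) {   // rows stream through one by one
            String[] fields = line.split(",");
            StringJoiner row = new StringJoiner(",");
            for (int i : keep) row.add(fields[i]);
            out.write(row.toString());
            out.newLine();
        }
    }
}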

Compression. It is well known that the compression ratio of objects affects resource consumption. Objects that contain only zeros will be compressed more quickly and consume fewer resources than a binary object. To get rid of this issue, we chose to use text documents with a similar compression ratio of around 60%. The documents were of sizes 10kB, 100kB, 1MB, and 10MB, respectively.

Image processing. As in the previous case, we tried to find the most common scenario for testing this function. We focused on those image sizes commonly used in static web pages. Finally, we used different .jpg files of 90kB, 400kB, 800kB, and 1200kB, and we set an implicit parameter so that the function resizes the image to 50% of its original size.

Signature verification. A sample usage of this function may be to verify official documents and forms in a public institution. For this experiment, we used text documents, also of different sizes: 10 kB, 100 kB, 1 MB, and 10 MB. These documents are signed with an RSA private key. The experiment operates on PUT requests, verifying the correctness of the signature.

Interactive data queries and result aggregation. For this use case, we used different sizes of the publicly available UbuntuOne log file [16]: 100MB, 1GB, and 10GB, respectively. We compared the execution time of Zion's functions with those obtained using Hadoop. We built a 10-node Hadoop cluster of commodity workstations (4-core i5 at 2.53 GHz and 16 GB of RAM). For this application, we issued a Pig query against: 1. The log files stored in HDFS; and 2. Swift, using the Apache Hadoop-Swift connector [5]. For Zion's first-stage filtering functions, we picked chunk sizes of 10MB, 32MB, and 200MB for the 100MB, 1GB, and 10GB files, respectively.

Table 1: Function information

Application                     LOC     Function Size
Content-level Access Control   ≈ 29    2.7 kB
Compression                    ≈ 8     1.8 kB
Image processing               ≈ 17    2.3 kB
Signature verification         ≈ 43    2.9 kB
Interactive data query         ≈ 203   6 kB

[Figure 5: Zion runtime startup time. Time in seconds that a Docker container with the Zion Java runtime takes to start; histogram of the number of containers vs. startup time (0.8 to 1.0 s).]

6.4 Application Characteristics

Table 1 shows information about our Zion applications. The LOC column gives the number of lines of code required to implement the function. The third column gives the function size used for each application. From this table, it can be seen that our functions are very lightweight for the proposed applications.

6.5 Results

Docker characteristics. In our prototype, the Java runtime, and then the functions, are run inside Docker containers. Consequently, the first validation to do was to assess the impact of starting our runtime within Docker containers. If the startup time were too large, it would hamper the inline processing capabilities of Zion. Our experiments, however, revealed that this is not the case.

For this experiment, we utilized a pre-compiled Docker image with our Java runtime integrated in it. The experiment consisted of launching 1,000 containers at different rates in the compute cluster and measuring the startup time, along with the amount of memory needed to keep the runtime up and running over time.

Fig. 5 shows the results of this experiment. As depicted, starting a new Docker container with the Java runtime takes between 0.85 and 0.95 seconds. Regarding RAM consumption, we found that each container consumes around 35MB of memory. These results show that our runtime takes 5x less time to start than the AWS Lambda runtime, which proves that Zion is lightweight enough for elastic inline processing. As Zion utilizes a large pool of already started containers, the startup time is typically negligible; it only amounts to 0.9 secs if there are no free containers in the pool.


[Figure 6: Zion base overhead. Boxplots of measured times (0.01 to 0.04 s) of requesting 10kB objects in Swift (a) and getting the same objects with a no-operation (nop) function through Zion (b).]

Zion overhead. As another basic experiment, we studied the extra overhead that Zion adds when running a function. The overhead is the time needed by the interception middleware to redirect the object to a compute node, take an available container, launch a function inside the Docker container, and pass the object data through it. For this experiment, we made use of the function listed in Listing 1.

This simple function iterates over the data without processing it. For measuring the overhead, we used a set of 10kB objects. We first launched 5,000 plain GET requests to measure the base time needed to complete a GET request. Then we launched another 5,000 GET requests for the same objects, but in this case we set up the function in Listing 1 to respond upon the onAfterGet trigger of the objects.

The results are plotted in Fig. 6. This figure shows that Zion's overhead is 9 ms. This time includes a penalty of 5 ms for the addition of an internal hop to Swift (redirecting the data through a compute node), plus 4 ms to determine whether, and which, function to run inside a free Docker container in the pool when the request touches a compute node.

Performance of collocated functions. Here we evaluated the performance of our application functions. For this evaluation, we conducted a stress test for each combination of application, object size, and number of function workers. As our storage nodes have 4 CPU cores, we ran each function on 1, 2, 3, and 4 function workers, respectively. The Docker containers were set up as follows: 1 CPU core and 512MB of RAM per worker. The results are depicted in Figs. 7-10.

The first experiments were done collocating the containers in the storage nodes, without using the Zion computation layer. The main objectives of this experiment were: 1. To show how collocation limits the number of TPS; and 2. To show the performance of Zion's functions using 1 CPU core per function worker. As a side effect, notice that the latter makes it possible to faithfully generalize our

[Figure 7: Compression function performance. Response time (ms) vs. TPS for object sizes of 10 kB, 100 kB, 1 MB, and 10 MB, with curves for 1 to 4 function workers.]

[Figure 8: Content-level access control function performance. Response time (ms) vs. TPS for object sizes of 100 kB, 1 MB, 10 MB, and 100 MB, with curves for 1 to 4 function workers.]

results to any number of cores. We empirically verified this for the image resize use case. As the cluster is made up of 6 storage nodes, the function with 1 worker was able to process around 9 images per second for the 400kB image case, and around 54 images per second in the entire cluster. For the rest of the applications, we utilized only one storage node. The reason is that, to tap into the performance of the entire cluster, we would have to replicate each object 6 times to ensure that a copy of it is available at every storage node: in active storage, the total computation power for an object is proportional to the number of replicas it has.

Notice that in Figs. 7-10 the curves representing the response time start to increase when the resources of the Docker container are overloaded, i.e., when all the CPU cores are at 100% of their capacity. These experiments also show how important the object size is. In all experiments, the larger the object size, the lower the number of TPS a function can handle. To wit, for the content-level access control application (Fig. 8), the function is unable to handle more than 1 TPS for the 100MB object, irrespective of the number of workers. Concretely, this task takes around 16 seconds to complete. The same occurs with the image resizing function. We


[Figure 9: Image resizer function performance. Response time (ms) vs. TPS for image sizes of 90 kB, 400 kB, 800 kB, and 1200 kB, with curves for 1 to 4 function workers.]

[Figure 10: Signature verification function performance. Response time (ms) vs. TPS for object sizes of 10 kB, 100 kB, 1 MB, and 10 MB, with curves for 1 to 4 function workers.]

saw that resizing a 1200kB image takes around 2.36 seconds, leading to < 1 TPS.

A function that needs more than 1 second to process an object can only handle in parallel as many objects as function workers are running. These examples show that function collocation at the storage nodes would in any case be insufficient to absorb a burst of transactions lasting more than one second. They also demonstrate that it is almost impossible to predict resource consumption ahead of time, because resource usage depends on the object size. Consequently, resource management for collocated functions would have to be dynamic, which is hard to achieve in practice, instead of a simple policy such as one single worker per CPU core.
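To make the arithmetic behind this limit explicit (our own back-of-the-envelope reasoning, using the numbers above): with W function workers and a per-object service time of T_s seconds, the sustainable throughput is bounded by

    Max TPS ≈ W / T_s,  e.g., 4 / 2.36 s ≈ 1.7 TPS

for the 1200kB image resizing case, so any burst above this rate must either queue or be absorbed by additional, non-collocated workers.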

Performance of non-collocated functions. As resource contention is not an issue with non-collocated functions, what is key here is to verify that Zion's storage-disaggregated functions are indeed scalable.

In order to test how Zion's compute layer provides better scalability than the storage nodes, we extended the prior evaluation to non-collocated functions. Our compute nodes have 24 cores each. This means that it is possible to start up to 24 function workers in

[Figure 11: Image resizer function scalability. Max TPS vs. number of workers (1 to 8) for image sizes of 90 kB, 400 kB, 800 kB, and 1200 kB, comparing storage nodes against the Zion compute layer.]

[Figure 12: Signature verification function scalability. Max TPS vs. number of workers (1 to 8) for object sizes of 10 kB, 100 kB, 1 MB, and 10 MB, comparing storage nodes against the Zion compute layer.]

each node. However, to make the comparison with the storage nodes fair, we utilized between 1 and 8 function workers.

As in the previous experiments, we used 1 proxy and 1 storage node, in addition to 1 compute node, which is enough to assess the scalability of Zion. The results are shown in Figs. 11-14. In this case, we recorded the maximum number of transactions per second (Max TPS) that each worker was able to handle, with and without collocation.

First, we can see how, in almost all cases with 1 to 4 function workers, non-collocated functions can handle more transactions than the storage nodes. This is due to the fact that the compute nodes are more powerful than the storage nodes in terms of CPU; therefore, the ingestion capacity is slightly higher. Second, the maximum TPS that a storage node can handle is always bounded above by the number of available cores. That is, spawning more workers has no benefit, because they will contend for the available resources. This is the reason why the curve for collocated functions (gray line) flattens out beyond 4 workers in all use cases.

However, non-collocated functions (black line) can continue to process transactions by spawning more workers to respond to the demand. From a quick inspection of all the figures, the scalability of non-collocated functions is practically linear with the number of


[Figure 13: Compression function scalability. Max TPS vs. number of workers (1 to 8) for object sizes of 10 kB, 100 kB, 1 MB, and 10 MB, comparing storage nodes against the Zion compute layer.]

[Figure 14: Content-level access control function scalability. Max TPS vs. number of workers (1 to 8) for object sizes of 100 kB, 1 MB, 10 MB, and 100 MB, comparing storage nodes against the Zion compute layer.]

workers. This suggests that disaggregating storage and compute is a practical solution to scale out computation in cloud object stores.

Interactive data queries. Table 2 compares the execution times of Hadoop and Zion for the same query (Listing 2). The entries of this table were obtained by running each configuration 30 times and then averaging the results.

Unsurprisingly, the Hadoop configuration that ingests the data from the remote Swift service is the one that presents the highest mean execution time. This occurred due to network contention: both clusters were geographically far apart, so communication went through some Fast Ethernet links within our institution's LAN. More concretely, it took ≈ 917 seconds to transfer the entire 10GB dataset and ≈ 86 seconds to transfer the 1GB dataset.

The configuration that read the data from HDFS had better times, especially for the large datasets. However, it has the disadvantage of requiring all the data to be pre-loaded into HDFS, which may not be practical in many cases.

For Zion functions, the execution times were comparatively small, despite using at most 24 workers. This suggests that better times can easily be attained with more workers. This is clearly seen when comparing Zion with 12 and 24 workers, respectively. For

Table 2: Interactive data queries execution times

Configuration       File size   Chunk size   Time
Pig query - Swift   100 MB      10 MB        81.6 s
Pig query - HDFS    100 MB      10 MB        71.4 s
Zion - 12 Workers   100 MB      10 MB        0.510 s
Zion - 24 Workers   100 MB      10 MB        0.348 s
Pig query - Swift   1 GB        32 MB        156 s
Pig query - HDFS    1 GB        32 MB        75.4 s
Zion - 12 Workers   1 GB        32 MB        41.83 s
Zion - 24 Workers   1 GB        32 MB        22.56 s
Pig query - Swift   10 GB       200 MB       985 s
Pig query - HDFS    10 GB       200 MB       94.6 s
Zion - 12 Workers   10 GB       200 MB       262.53 s
Zion - 24 Workers   10 GB       200 MB       133.92 s

the 10GB dataset, 12 workers took 2x more time than 24 workers. Also, this experiment confirms the scalability of Zion without incurring resource contention in the storage cluster. If we had executed this interactive query with 24 workers in the storage nodes, the storage cluster CPU would have reached 100% in all storage nodes for 134 seconds, leading to resource contention.

Listing 2: Evaluated query in Hadoop and Swift + Zion clusters

select user_id, count(*) total
where (req_t = 'GetContentResponse' or req_t = 'PutContentResponse')
and msg = 'Request done'
group by user_id
order by total DESC limit 10

7 CONCLUSION

This paper presents Zion, an innovative data-driven serverless computing middleware for object storage. Unlike commercial event-driven serverless models, our data-driven functions intercept and operate on objects as they are read/written from/to the object store. Since Zion is serverless, it overcomes the scalability and resource contention problems of active storage, without the need to manage a server or a runtime environment.

By injecting computations into the data pipeline, Zion is ideal for use cases that require synchronous interactions with external users. Examples of these use cases include (but are not limited to) dynamic content generation, interactive queries, personalization, content verification, and access control. In many of them, the data locality of our inline computations contributes to optimizing latency and to reducing data transfers outside the data center.

ACKNOWLEDGMENTS

This work has been partially funded by the European Union Horizon 2020 Framework Programme in the context of the project IOStack: Software-defined Storage for Big Data (H2020-644182) and by the Spanish Ministry of Science and Innovation (Grant TIN2016-77836-C2-1-R).


REFERENCES
[1] Amazon. 2017. Athena. https://aws.amazon.com/athena. (2017).
[2] Amazon. 2017. Lambda. https://aws.amazon.com/lambda. (2017).
[3] Amazon. 2017. Redshift Spectrum. https://aws.amazon.com/redshift/spectrum. (2017).
[4] Apache. 2012. HBASE Coprocessors. https://blogs.apache.org/hbase/entry/coprocessor_introduction. (2012).
[5] Apache. 2017. Hadoop OpenStack Support: Swift Object Store. https://hadoop.apache.org/docs/stable/hadoop-openstack/index.html. (2017).
[6] Krste Asanovic and D. Patterson. 2014. Firebox: A hardware building block for 2020 warehouse-scale computers. In FAST.
[7] Prosunjit Biswas, Farhan Patwa, and Ravi Sandhu. 2015. Content Level Access Control for OpenStack Swift Storage. In ACM CODASPY. 123-126.
[8] Chao Chen, Yong Chen, and Philip C. Roth. 2012. DOSAS: Mitigating the resource contention in active storage systems. In CLUSTER. 164-172.
[9] Facebook. 2017. Presto: Distributed SQL Engine for Big Data. https://prestodb.io. (2017).
[10] Sadjad Fouladi, Riad S. Wahby, Brennan Shacklett, Karthikeyan Vasuki Balasubramaniam, William Zeng, Rahul Bhalerao, Anirudh Sivaraman, George Porter, and Keith Winstein. 2017. Encoding, Fast and Slow: Low-Latency Video Processing Using Thousands of Tiny Threads. In NSDI.
[11] Peter X. Gao, Akshay Narayan, Sagar Karandikar, Joao Carreira, Sangjin Han, Rachit Agarwal, Sylvia Ratnasamy, and Scott Shenker. 2016. Network requirements for resource disaggregation. In OSDI. 249-264.
[12] Google. 2009. BigTable Coprocessors. https://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf. (2009).
[13] Google. 2017. Cloud Functions. https://cloud.google.com/functions/docs. (2017).
[14] Raúl Gracia-Tinedo, Pedro García-López, Marc Sánchez-Artigas, Josep Sampé, Yosef Moatti, Eran Rom, Dalit Naor, Ramon Nou, Toni Cortés, Pietro Michiardi, and William Oppermann. 2016. IOStack: Software-defined Object Storage. IEEE Internet Computing (2016).
[15] Raúl Gracia-Tinedo, Josep Sampé, Edgar Zamora, Marc Sánchez-Artigas, Pedro García-López, Yosef Moatti, and Eran Rom. 2017. Crystal: Software-Defined Storage for Multi-Tenant Object Stores. In FAST.
[16] Raúl Gracia-Tinedo, Yongchao Tian, Josep Sampé, Hamza Harkous, John Lenton, Pedro García-López, Marc Sánchez-Artigas, and Marko Vukolic. 2015. Dissecting UbuntuOne: Autopsy of a Global-scale Personal Cloud Back-end. In IMC. 155-168.
[17] Scott Hendrickson, Stephen Sturdevant, Tyler Harter, Venkateshwaran Venkataramani, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2016. Serverless computation with OpenLambda. In HotCloud.
[18] HP. 2017. The Machine: Our vision for the Future of Computing. https://www.labs.hpe.com/the-machine. (2017).
[19] IBM. 2017. OpenWhisk. https://developer.ibm.com/openwhisk. (2017).
[20] Eric Jonas, Shivaram Venkataraman, Ion Stoica, and Benjamin Recht. 2017. Occupy the Cloud: Distributed computing for the 99%. arXiv preprint arXiv:1702.04024 (2017).
[21] Microsoft. 2017. Azure Functions. https://azure.microsoft.com/en-us/services/functions. (2017).
[22] Netflix. 2016. Evolution of the Netflix Data Pipeline. http://techblog.netflix.com/2016/02/evolution-of-netflix-data-pipeline.html. (2016).
[23] OpenStack. 2017. Nova. https://docs.openstack.org/nova. (2017).
[24] OpenStack. 2017. Swift. https://docs.openstack.org/swift. (2017).
[25] OpenStack. 2017. Swift Large Objects. https://docs.openstack.org/swift/latest/overview_large_objects.html. (2017).
[26] Juan Piernas, Jarek Nieplocha, and Evan J. Felix. 2007. Evaluation of active storage strategies for the lustre parallel file system. In SC. 28.
[27] Lingjun Qin and Dan Feng. 2006. Active Storage Framework for Object-based Storage Device. In AINA. 97-101.
[28] Simona Rabinovici-Cohen, Ealan Henis, John Marberg, and Kenneth Nagin. 2014. Storlet engine: performing computations in cloud storage. IBM Technical Report H-0320 (August 2014).
[29] Paul Rad, Van Lindberg, Jeff Prevost, Weining Zhang, and Mo Jamshidi. 2014. ZeroVM: secure distributed processing for big data analytics. In WAC. 1-6.
[30] Redis. 2017. Redis. https://redis.io. (2017).
[31] Erik Riedel, Garth A. Gibson, and Christos Faloutsos. 1998. Active Storage for Large-Scale Data Mining and Multimedia. In VLDB. 62-73.
[32] Michael T. Runde, Wesley G. Stevens, Paul A. Wortman, and John A. Chandy. 2012. An active storage framework for object storage devices. In MSST. 1-12.
[33] Josep Sampé. 2017. Zion framework. https://github.com/JosepSampe/zion. (2017).
[34] Josep Sampé, Pedro García-López, and Marc Sánchez-Artigas. 2016. Vertigo: Programmable micro-controllers for software-defined object storage. In IEEE CLOUD. 180-187.
[35] Seung Woo Son, Samuel Lang, Philip Carns, Robert Ross, Rajeev Thakur, Berkin Ozisikyilmaz, Prabhat Kumar, Wei-Keng Liao, and Alok Choudhary. 2010. Enabling active storage on parallel I/O software stacks. In MSST. 1-12.
[36] Zhipeng Tan, Yanli Yuan, Dan Feng, Tian Zhan, and Xudong Tu. 2011. Implementation of method object in active object-based storage systems. In CCIS. 204-211.
[37] Ubuntu. 2017. Cloud image repository. https://cloud-images.ubuntu.com. (2017).
[38] Ubuntu. 2017. Docker repository. https://hub.docker.com/r/_/ubuntu. (2017).
[39] University of California, Irvine. 1996. Adult Data Set. http://archive.ics.uci.edu/ml/datasets/Adult. (1996).
[40] Himanshu Vashishtha and Eleni Stroulia. 2011. Enhancing query support in hbase via an extended coprocessors framework. In ServiceWave. 75-87.
[41] Rajiv Wickremesinghe, Jeffrey S. Chase, and Jeffrey Scott Vitter. 2002. Distributed computing with load-managed active storage. In HPDC. 13-23.
[42] Yulai Xie, Dan Feng, Yan Li, and Darrell D. E. Long. 2016. Oasis: an active storage framework for object storage platform. Future Generation Computer Systems 56 (2016), 746-758.




Here we will merely consider the case of compressing incomingobjects ldquoon-the-flyrdquo which means that upon a GET request by theuser the target object will have to undergo decompression Thecommon approach for doing so in Swift is to implement an ad-hoccompression middleware However this approach is problematicFirst by running compression on the storage nodes compressionis repeated as many times are replicas there are By offloading it tothe proxies we can disrupt Swiftrsquos normal operation under heavy

1As the Adult dataset does not contain explicit identifiers we added a random SSN toeach row using the Fake Name Generator

126

Data-Driven Serverless Functions for Object Storage Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA

load since the proxies are responsible for looking up the locationof objects in the rings and routing the requests accordingly

With Zion we can easily write down a compression functionand execute it between the proxies and the storage nodes in a scal-able way without worrying about resources or repeating the taskmany times In addition to the possibility to intercept GET requestsZion has another advantage over AWS Lambda It is capable to runcompression over objects whose total compression time exceed thefive minutesrsquo limit This is because Zion cancels a request only ifthe time for the receipt of the first byte from the function exceeds atimeout value (see Section 321 for details) This model is ideal foroperations such as compression that can be run as data is readwritewhich have been the focus of active storage for a long time

To implement the compression function we utilized gzip andthen we mapped it to a container with the onPut trigger As suchall writes to this container will be compressed We did the reverseprocess for the onAfterGet trigger so that when a user requestedan object of this container she would get the original uncompressedversion To do so we made use of an implicit parameter to tell thefunction what to do either to compress or decompress That is forthe onPut trigger the implicit parameter was set to ldquocompressionrdquoFor the onAfterGet trigger the implicit parameter value was setto ldquodecompressionrdquo

Image processing One of the archetypal use cases of serverlesscomputing is that of image resizing for we found it interesting toevaluate it here It is ideal to show the potentials of asynchronousevent-based functions such as AWS Lambdas and also very usefulfor tenants that use the object store as a back-end for web imagestorage In this case when the images are uploaded a function istriggered resizing and creating all the needed images for examplefor the different possible devices that can request the main web(phone tablet PC etc)

We did the same with Zion and coded a function that resizes animage to an arbitrary percentage To intercept the PUT requestswe linked it to an onPut trigger and specified jpg as the suffixof the object name So all objects whose name ended with jpgwere rescaled As Zion allows to create new objects as part of thefunctionrsquos output the function stores the original object and itsresized version(s)

Because of interception one interesting feature of Zion is that itdoes not require to fully store an image before the generation of itsresized version(s) as it occurs with AWS Lambda and S3 Since it isdone ldquoon-the-flyrdquo prior to storage Zion saves storage bandwidthAlthough of not much concern at first glance this property is veryinteresting for concurrently processing a vast collection of imagesas in [10] because IO savings add up rapidly for a large amount ofobjects

Signature verification To demonstrate the versatility of Zionwe proposed the signature verification use case Online contentchecking is again a perfect use case for data-driven functions sinceit requires an immediate response to the client

The main objective of this function is to verify that the objects(documents) that are uploaded are signed by the user and to verifythat the signature is valid ie the documents are authentic WithZion is possible to do this task in near-real time and notify the userinstantly in the case of rejection Also in the case of rejection we

prevent the archival of an inauthentic document thereby savingstorage space For the same reasons as above it is readily evidentthat this use case cannot be implemented with AWS Lambdas

The scenario is as follows The user obtains the public key fromher RSA key pair and she uploads it to a public container in theobject storage system Then the user signs the document with theprivate key and uploads it with the signature in a special header tothe object storage system which puts into execution the functionand verifies the document Note that in this case the function usesan explicit parameter (signature) described above

To do so we implemented a signature verification function Thefunction is mapped with a container to the onPut trigger Thereforeall object PUTs to this container will be enforced The function firstgets the public key from the object store based on the user who isuploading the object Then it loads the document content and itverifies the signature If it is valid the document is stored otherwisethe document is rejected sending an error message to the user

Interactive queries and result aggregation Finally interactivequeries is a use case that perfectly matches our data-driven modelWhen we want to perform fast data queries over existing datarepositories our data-driven model avoids moving the entire datasetto a computing cluster

For example object storage services are commonly used toarchive data like log files Businesses that want to extract fast in-sights from these data repositories using big data analytic toolsmust choose between two strategies 1 Moving data to the com-putation cluster to take advantage from data locality or 2 Usinga connector to allow data analytic tools to read data directly fromthe remote object storage With Zion we offer a third strategyCompute in the storage cluster using functions that filter data andaggregate results inline

The case example we present is a top-k query on the access logs of UbuntuOne [16], a personal cloud service. We want to obtain the list of the most active users and the total number of requests each user sent. The implementation of this query in Zion has two functions. The first one receives a fragment of the log file and filters the request logs, maintaining a counter for each user, thus exploiting parallelism; a sketch of it is shown below. The second function receives as input the partial counters produced by the various instances of the first function and performs the aggregation and sorting needed to produce the desired result. To implement this MapReduce-like example, we tapped into Swift's SLO-based data partitioning (see Section 4.1 for further details).
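For reference, a minimal sketch of the first-stage filter under the same assumptions as before (the Zion types of Listing 1), plus a hypothetical comma-separated log format with the user id in the first field:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Map;

public class TopKFilter implements IFunction {
    public void invoke(Context ctx, API api) {
        Map<String, Integer> counters = new HashMap<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(ctx.object.stream.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (!line.contains("Request done")) continue; // completed requests only
                String[] fields = line.split(",");
                String userId = fields[0];                // hypothetical field position
                counters.merge(userId, 1, Integer::sum);  // one more request for user
            }
            // Emit "user,count" lines as the partial result for the second stage
            StringBuilder out = new StringBuilder();
            counters.forEach((u, c) ->
                    out.append(u).append(',').append(c).append('\n'));
            ctx.object.stream.write(out.toString().getBytes());
        } catch (Exception e) {
            ctx.logger.error("Filter failed: " + e.getMessage());
        }
    }
}

The second-stage function is symmetric: it reads these partial "user,count" lines from all instances, sums the counters per user, sorts them, and emits the top-k rows.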

6 EVALUATION

To confirm our suspicions, we first studied how Swift behaves when computation tasks are collocated in the same storage node. We then ran micro-benchmarks to measure the behavior and overheads of our functions. We did so by running standalone experiments and the applications discussed in Section 5.

6.1 System Setup

Our experimental testbed consisted of a host (or client) with 2 VCPUs and 4GB RAM. On the server side, we deployed Zion in an 8-machine rack OpenStack Swift (Ocata version) installation formed by 2 proxy nodes (Dell PowerEdge R320 with 12GB RAM) and 6 storage nodes (Dell PowerEdge R320 with 8GB RAM, each one with 4 CPU cores).


[Figure 3: Swift measurement. Measured response times (a) and CPU consumption of a given storage node (b) when getting objects at a certain rate (transactions per second, TPS), for 1 to 4 Swift workers.]

[Figure 4: Swift interference measurement. Response time vs. TPS of the Swift object storage service on a given storage node for 1 to 4 object server workers (panels a-d), under different CPU loads of interference (Intf, 0-80%).]

At the compute side, the computation layer is composed of 3 nodes (Dell PowerEdge R430 with 32GB RAM, each one with 24 CPU cores). All the rack machines ran Ubuntu Server 14.04. The client host ran the Ubuntu 16.04.1 Cloud Image [37].

6.2 Swift Resource Contention

We first studied how the collocation of compute and data affects Swift. To do so under "ideal" conditions, we restricted this experiment to a single storage node.² For this measurement, we used a single proxy node (Dell PowerEdge R320 with 12GB RAM) and 1 storage node (Dell PowerEdge R320 with 8GB RAM).

Base Swift. The first experiment consisted of evaluating how Swift normally works (Fig. 3). For this experiment, we stored a set of random 10kB files into Swift. Next, using the Unix httperf tool, we ran distinct workloads, each one differing in the number of transactions per second (TPS). We measured the resulting per-transaction response time (Fig. 3a) and CPU usage (Fig. 3b) of the storage node.

Technically, Swift uses workers to handle a workload. In Swift, a worker is nothing but a thread that accepts requests.

² We note that the results are easy to extrapolate to larger Swift deployments.

Each worker normally accepts 1,024 concurrent requests, though this is a configurable parameter. This simple experiment confirmed that when a worker exhausts 100% of its core's resources, the response time steeply increases due to queuing delays. For instance, with 1 worker, the core's usage reaches 100% at around 380 TPS, the point beyond which requests start to accumulate, as shown in Fig. 3a. This effect can be alleviated by starting new Swift workers, as seen in this figure. However, since storage nodes have only 4 cores, Swift is limited to 4 workers, and therefore a storage node will eventually experience resource contention sooner or later, even if no computations are collocated with the data.

Collocated computation. We repeated the same experiment, but collocating computations with the storage service. The results in Fig. 4 show how Swift's processing capacity diminishes as a function of the CPU usage taken by the collocated tasks. For example, if Swift were restricted to use only 1 worker for request processing, collocated tasks would have almost no impact on Swift. However, as the number of Swift workers increases to match the number of CPU cores, resource contention begins to produce negative effects in the storage system due to the interference from the collocated computations.


Swift's best practices recommend using as many workers as there are CPU cores. Under heavy use of the CPU by the collocated tasks, this figure shows that Swift request processing would be severely diminished. For instance, when the collocated tasks consumed 80% of the total CPU, Swift processing fell to 35% in this experiment. This result suggests that the right path is to move computations outside the storage nodes, which is what we did with Zion.

6.3 Application Setup

We now describe the specific evaluation setup for our example use cases. As object stores can contain a huge diversity of object types and sizes [16], we set up heterogeneous workloads for all use cases in order to verify the behavior of Zion under different scenarios.

Content-level access control. For testing this application, we used the dataset described in Section 5, but trimmed down to different object sizes: 100kB, 1MB, 10MB and 100MB. As stated before, the dataset content is filtered according to the type of user that requested the object. To this aim, we used Swift user roles to return only specific fields to each user. The function associated with the onAfterGet trigger reads from an implicit parameter the JSON string containing the allowed fields for a specific user's role (e.g., age, education, marital-status) and then returns them to the user.
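A minimal sketch of this onAfterGet function, assuming the Zion types of Listing 1 and, for brevity, an implicit parameter already flattened to a comma-separated field list (the paper stores it as a JSON string); all helper names are illustrative:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ContentLevelFilter implements IFunction {
    public void invoke(Context ctx, API api) {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(ctx.object.stream.getInputStream()))) {
            // Fields this role may see, e.g. "age,education,marital-status"
            List<String> allowed =
                    Arrays.asList(ctx.getParameter("fields").split(","));
            String[] header = in.readLine().split(",");
            List<Integer> keep = new ArrayList<>();
            for (int i = 0; i < header.length; i++)
                if (allowed.contains(header[i])) keep.add(i);
            // Stream the filtered header and rows back to the requester
            ctx.object.stream.write((String.join(",", allowed) + "\n").getBytes());
            String line;
            while ((line = in.readLine()) != null) {
                String[] cols = line.split(",");
                StringBuilder row = new StringBuilder();
                for (int i : keep) row.append(cols[i]).append(',');
                if (row.length() > 0) row.setLength(row.length() - 1);
                ctx.object.stream.write(row.append('\n').toString().getBytes());
            }
        } catch (Exception e) {
            ctx.logger.error("Filter failed: " + e.getMessage());
        }
    }
}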

Compression. It is well known that the compression ratio of objects affects resource consumption: objects that contain only zeros are compressed more quickly and consume fewer resources than binary objects. To factor out this issue, we chose text documents with a similar compression ratio of around 60%. The documents were of sizes 10kB, 100kB, 1MB and 10MB, respectively.

Image processing. As in the previous case, we tried to find the most common scenario for testing this function. We focused on the image sizes commonly used in static web pages. We finally used .jpg files of 90kB, 400kB, 800kB and 1200kB, and we set an implicit parameter so that the function resizes each image to 50% of its original size.

Signature verification. A sample usage of this function may be to verify official documents and forms in a public institution. For this experiment, we used text documents, also of different sizes: 10kB, 100kB, 1MB and 10MB. These documents are signed with an RSA private key. The experiment operates on PUT requests, verifying the correctness of the signature.

Interactive data queries and result aggregation. For this use case, we used different sizes of the publicly available UbuntuOne log file [16]: 100MB, 1GB and 10GB, respectively. We compared the execution times of Zion's functions to those obtained using Hadoop. We built a 10-node Hadoop cluster of commodity workstations (4-core i5 at 2.53 GHz with 16GB of RAM). For this application, we issued a Pig query against: 1. The log files stored in HDFS; and 2. Swift, using the Apache Hadoop-Swift connector [5]. For Zion's first-stage filtering functions, we picked chunk sizes of 10MB, 32MB and 200MB for the 100MB, 1GB and 10GB files, respectively.

Table 1: Function information

Application                    LOC     Function Size
Content-level Access Control   ≈ 29    2.7 kB
Compression                    ≈ 8     1.8 kB
Image processing               ≈ 17    2.3 kB
Signature verification         ≈ 43    2.9 kB
Interactive data query         ≈ 203   6 kB

[Figure 5: Zion runtime startup time. Histogram of the time (0.8-1.0 seconds) that a Docker container with the Zion Java runtime takes to start.]

6.4 Application Characteristics

Table 1 shows information about our Zion applications. The LOC column gives the number of lines of code required to implement each function. The third column gives the size of each deployed function. From this table, it can be seen that our functions are very lightweight for the proposed applications.

6.5 Results

Docker characteristics. The Java runtime, and then the functions, run inside Docker containers in our prototype. Consequently, the first validation to do was to assess the impact of starting our runtime within Docker containers. If the startup time were too large, it would hamper the inline processing capabilities of Zion. Our experiments, however, revealed that this is not the case.

For this experiment, we utilized a pre-compiled Docker image with our Java runtime integrated in it. The experiment consisted of launching 1,000 containers at different rates in the compute cluster and measuring their startup times, along with the amount of memory needed to keep the runtime up and running over time.
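A rough sketch of this measurement loop, assuming a pre-built image hypothetically named zion-java-runtime; timing docker run only approximates the moment at which the JVM inside is actually ready, so this illustrates the methodology rather than the exact harness we used:

import java.util.ArrayList;
import java.util.List;

public class StartupBenchmark {
    public static void main(String[] args) throws Exception {
        List<Long> millis = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            long t0 = System.nanoTime();
            // -d detaches; docker run returns once the container has started
            Process p = new ProcessBuilder(
                    "docker", "run", "-d", "zion-java-runtime").start();
            p.waitFor();
            millis.add((System.nanoTime() - t0) / 1_000_000);
        }
        double mean = millis.stream().mapToLong(Long::longValue).average().orElse(0);
        System.out.printf("mean startup time: %.0f ms%n", mean);
    }
}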

Fig. 5 shows the results of this experiment. As depicted, starting a new Docker container with the Java runtime takes between 0.85 and 0.95 seconds. Regarding RAM consumption, we found that each container consumes around 35MB of memory. These results show that our runtime takes 5X less time to start than the AWS Lambda runtime, which proves that Zion is lightweight enough for elastic inline processing. As Zion utilizes a large pool of already-started containers, the startup time is typically negligible and only amounts to 0.9 secs when there are no free containers in the pool.


[Figure 6: Zion base overhead. Boxplots of measured times (0.01-0.04 s) for requesting 10kB objects in Swift (a) and for getting the same objects through Zion with a no-operation (nop) function (b).]

Zion overhead. As another basic experiment, we studied the extra overhead that Zion adds when running a function. The overhead is the time needed by the interception middleware to redirect the object to a compute node, take an available container, launch a function inside the Docker container, and pass the object data through it. For this experiment, we made use of the function listed in Listing 1.

This simple function iterates over the data without processing it. For measuring the overhead, we used a set of 10kB objects. We first launched 5,000 plain GET requests to measure the base time needed to complete a GET request. Then we launched another 5,000 GET requests for the same objects, but this time with the function in Listing 1 set up to respond upon the onAfterGet trigger of the objects.
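For reference, the client-side timing loop can be driven as in the following sketch, which assumes an illustrative Swift object URL and an auth token held in an environment variable (Java 11 HttpClient; X-Auth-Token is the standard Swift auth header):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class OverheadProbe {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // Illustrative Swift object URL: /v1/<account>/<container>/<object>
        String url = "http://proxy:8080/v1/AUTH_test/bench/object-10k";
        long totalMs = 0;
        for (int i = 0; i < 5000; i++) {
            HttpRequest req = HttpRequest.newBuilder(URI.create(url))
                    .header("X-Auth-Token", System.getenv("OS_TOKEN"))
                    .GET().build();
            long t0 = System.nanoTime();
            client.send(req, HttpResponse.BodyHandlers.ofByteArray());
            totalMs += (System.nanoTime() - t0) / 1_000_000;
        }
        System.out.println("mean GET time: " + (totalMs / 5000.0) + " ms");
    }
}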

The results are plotted in Fig. 6. This figure shows that Zion's overhead is 9 ms. This time includes a penalty of 5 ms for the addition of an internal hop to Swift (redirecting the data through a compute node), plus 4 ms to determine whether, and which, function to run inside a free Docker container of the pool once the request touches a compute node.

Performance of collocated functions. Here we evaluated the performance of our application functions. For this evaluation, we conducted a stress test for each combination of application, object size, and number of function workers. As our storage nodes have 4 CPU cores, we ran each function on 1, 2, 3 and 4 function workers, respectively. The Docker containers were set up with 1 CPU core and 512MB of RAM per worker. The results are depicted in Figs. 7-10.

The first experiments were done collocating the containers in the storage nodes, without using the Zion computation layer. The main objectives of this experiment were: 1. To show how collocation limits the number of TPS; and 2. To show the performance of Zion's functions using 1 CPU core per function worker. As a side effect, notice that the latter makes it possible to faithfully generalize our results to any number of cores.

[Figure 7: Compression function performance. Response time (ms) vs. TPS for 10kB, 100kB, 1MB and 10MB objects, with 1 to 4 workers.]

[Figure 8: Content-level access control function performance. Response time (ms) vs. TPS for 100kB, 1MB, 10MB and 100MB objects, with 1 to 4 workers.]

We empirically verified this for the image resize use case. As the cluster is made up of 6 storage nodes, the function with 1 worker was able to process around 9 images per second for the 400kB image case, and around 54 images per second in the entire cluster. For the rest of the applications, we utilized only one storage node. The reason is that, to tap into the performance of the entire cluster, we would have to replicate each object 6 times to ensure that a copy of it is available at every storage node: in active storage, the total computation power for an object is proportional to the number of replicas it has.

Notice that in Figs. 7-10 the curves representing the response time start to increase when the resources of the Docker container are overloaded, i.e., when all the CPU cores are at 100% of their capacity. These experiments also show how important the object size is: in all experiments, the larger the object, the lower the number of TPS a function can handle. To wit, for the content-level access control application (Fig. 8), the function is unable to handle more than 1 TPS for the 100MB object, irrespective of the number of workers; concretely, this task takes around 16 seconds to complete. The same occurs with the image resizing function:


[Figure 9: Image resizer function performance. Response time (ms) vs. TPS for 90kB, 400kB, 800kB and 1200kB images, with 1 to 4 workers.]

[Figure 10: Signature verification function performance. Response time (ms) vs. TPS for 10kB, 100kB, 1MB and 10MB objects, with 1 to 4 workers.]

we saw that resizing an image of 1200kB takes around 2.36 seconds, leading to < 1 TPS.

A function that needs more than 1 second to process an object can only handle in parallel as many objects as there are function workers running. For example, at 2.36 seconds per 1200kB image, a single worker sustains at most 1/2.36 ≈ 0.42 TPS, and n workers at most n/2.36 TPS. These examples show that function collocation at the storage nodes would in any case be insufficient to absorb a burst of transactions lasting more than one second. They also demonstrate that it is almost impossible to predict resource consumption ahead of time, because it depends on the object size. Consequently, resource management for collocated functions would have to be dynamic, which is hard to achieve in practice, instead of a simple policy such as one worker per CPU core.

Performance of non-collocated functions. As resource contention is not an issue with non-collocated functions, what is key here is to verify that Zion's storage-disaggregated functions are indeed scalable.

To test whether Zion's compute layer provides better scalability than the storage nodes, we extended the prior evaluation to non-collocated functions. Our compute nodes have 24 cores each, so it is possible to start up to 24 function workers on each node.

[Figure 11: Image resizer function scalability. Max TPS vs. number of workers (1-8) for 90kB, 400kB, 800kB and 1200kB images, comparing storage nodes and the Zion compute layer.]

[Figure 12: Signature verification function scalability. Max TPS vs. number of workers (1-8) for 10kB, 100kB, 1MB and 10MB objects, comparing storage nodes and the Zion compute layer.]

However, to make the comparison with the storage nodes fair, we utilized between 1 and 8 function workers.

As in the previous experiments, we used 1 proxy and 1 storage node, in addition to 1 compute node, which is enough to assess the scalability of Zion. The results are shown in Figs. 11-14. In this case, we recorded the maximum number of transactions per second (Max TPS) that each worker count was able to handle, with and without collocation.

First, we can see how, in almost all cases with 1 to 4 function workers, non-collocated functions can handle more transactions than the storage nodes. This is due to the fact that the compute nodes are more powerful than the storage nodes in terms of CPU, so their ingestion capacity is slightly higher. Second, the maximum TPS that a storage node can handle is always bounded above by the number of available cores; that is, spawning more workers brings no benefit, because they will contend for the available resources. This is the reason why the curve for collocated functions (gray line) flattens out beyond 4 workers in all use cases.

However, non-collocated functions (black line) can continue to process transactions by spawning more workers to respond to the demand. By a quick inspection of all the figures, the scalability of non-collocated functions is practically linear with the number of workers.


[Figure 13: Compression function scalability. Max TPS vs. number of workers (1-8) for 10kB, 100kB, 1MB and 10MB objects, comparing storage nodes and the Zion compute layer.]

[Figure 14: Content-level access control function scalability. Max TPS vs. number of workers (1-8) for 100kB, 1MB, 10MB and 100MB objects, comparing storage nodes and the Zion compute layer.]

This suggests that disaggregating storage and compute is a practical solution to scale out computation in cloud object stores.

Interactive data queries. Table 2 compares the execution times of the same query (Listing 2) for Hadoop and Zion. The entries of this table were obtained by running each configuration 30 times and then averaging the results.

Unsurprisingly, the Hadoop configuration that ingests the data from the remote Swift service is the one that presents the highest mean execution times. This occurred due to network contention: both clusters were geographically far apart, so communication went through some Fast Ethernet links within our institution's LAN. More concretely, it took ≈ 917 seconds to transfer the entire 10GB dataset and ≈ 86 seconds to transfer the 1GB dataset.

The configuration that read the data from HDFS had better times, especially for the large datasets. However, it has the disadvantage of requiring all the data to be pre-loaded into HDFS, which may not be practical in many cases.

For Zion functions, the execution times were comparatively small, despite using at most 24 workers. This suggests that better times can easily be attained with more workers, which is clearly seen when comparing Zion with 12 and 24 workers.

Table 2: Interactive data queries execution times

Configuration       File size   Chunk size   Time
Pig query - Swift   100 MB      10 MB        81.6 s
Pig query - HDFS    100 MB      10 MB        71.4 s
Zion - 12 Workers   100 MB      10 MB        0.510 s
Zion - 24 Workers   100 MB      10 MB        0.348 s
Pig query - Swift   1 GB        32 MB        156 s
Pig query - HDFS    1 GB        32 MB        75.4 s
Zion - 12 Workers   1 GB        32 MB        4.183 s
Zion - 24 Workers   1 GB        32 MB        2.256 s
Pig query - Swift   10 GB       200 MB       985 s
Pig query - HDFS    10 GB       200 MB       94.6 s
Zion - 12 Workers   10 GB       200 MB       26.253 s
Zion - 24 Workers   10 GB       200 MB       13.392 s

For the 10GB dataset, 12 workers took 2X more time than 24 workers. Also, this experiment confirms the scalability of Zion without incurring resource contention in the storage cluster: if we had executed this interactive query with 24 workers in the storage nodes, the storage cluster CPU would have reached 100% in all storage nodes for 13.4 seconds, leading to resource contention.

Listing 2: Evaluated query in Hadoop and Swift + Zion clusters

select user_id, count(*) total
where (req_t = 'GetContentResponse' or req_t = 'PutContentResponse')
and msg = 'Request done'
group by user_id order by total DESC limit 10

7 CONCLUSION

This paper presents Zion, an innovative data-driven serverless computing middleware for object storage. Unlike commercial event-driven serverless models, our data-driven functions intercept and operate over objects as they are read/written from/to the object store. Since Zion is serverless, it overcomes the scalability and resource contention problems of active storage, without the need to manage a server or a runtime environment.

By injecting computations into the data pipeline, Zion is ideal for use cases that require synchronous interactions with external users. Examples of these use cases include (but are not limited to) dynamic content generation, interactive queries, personalization, content verification, and access control. In many of them, the data locality of our inline computations contributes to optimizing latency and to reducing data transfers outside the data center.

ACKNOWLEDGMENTS

This work has been partially funded by the European Union Horizon 2020 Framework Programme in the context of the project IOStack: Software-defined Storage for Big Data (H2020-644182) and by the Spanish Ministry of Science and Innovation (Grant TIN2016-77836-C2-1-R).


REFERENCES

[1] Amazon. 2017. Athena. https://aws.amazon.com/athena.
[2] Amazon. 2017. Lambda. https://aws.amazon.com/lambda.
[3] Amazon. 2017. Redshift Spectrum. https://aws.amazon.com/redshift/spectrum.
[4] Apache. 2012. HBase Coprocessors. https://blogs.apache.org/hbase/entry/coprocessor_introduction.
[5] Apache. 2017. Hadoop OpenStack Support: Swift Object Store. https://hadoop.apache.org/docs/stable/hadoop-openstack/index.html.
[6] Krste Asanovic and D. Patterson. 2014. Firebox: A hardware building block for 2020 warehouse-scale computers. In FAST.
[7] Prosunjit Biswas, Farhan Patwa, and Ravi Sandhu. 2015. Content Level Access Control for OpenStack Swift Storage. In ACM CODASPY. 123-126.
[8] Chao Chen, Yong Chen, and Philip C. Roth. 2012. Dosas: Mitigating the resource contention in active storage systems. In CLUSTER. 164-172.
[9] Facebook. 2017. Presto: Distributed SQL Engine for Big Data. https://prestodb.io.
[10] Sadjad Fouladi, Riad S. Wahby, Brennan Shacklett, Karthikeyan Vasuki Balasubramaniam, William Zeng, Rahul Bhalerao, Anirudh Sivaraman, George Porter, and Keith Winstein. 2017. Encoding, Fast and Slow: Low-Latency Video Processing Using Thousands of Tiny Threads. In NSDI.
[11] Peter X. Gao, Akshay Narayan, Sagar Karandikar, Joao Carreira, Sangjin Han, Rachit Agarwal, Sylvia Ratnasamy, and Scott Shenker. 2016. Network requirements for resource disaggregation. In OSDI. 249-264.
[12] Google. 2009. BigTable Coprocessors. https://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf.
[13] Google. 2017. Cloud Functions. https://cloud.google.com/functions/docs.
[14] Raúl Gracia-Tinedo, Pedro García-López, Marc Sánchez-Artigas, Josep Sampé, Yosef Moatti, Eran Rom, Dalit Naor, Ramon Nou, Toni Cortés, Pietro Michiardi, and William Oppermann. 2016. IOStack: Software-defined Object Storage. IEEE Internet Computing (2016).
[15] Raúl Gracia-Tinedo, Josep Sampé, Edgar Zamora, Marc Sánchez-Artigas, Pedro García-López, Yosef Moatti, and Eran Rom. 2017. Crystal: Software-Defined Storage for Multi-Tenant Object Stores. In FAST.
[16] Raúl Gracia-Tinedo, Yongchao Tian, Josep Sampé, Hamza Harkous, John Lenton, Pedro García-López, Marc Sánchez-Artigas, and Marko Vukolic. 2015. Dissecting UbuntuOne: Autopsy of a Global-scale Personal Cloud Back-end. In IMC. 155-168.
[17] Scott Hendrickson, Stephen Sturdevant, Tyler Harter, Venkateshwaran Venkataramani, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2016. Serverless computation with OpenLambda. In HotCloud.
[18] HP. 2017. The Machine: Our vision for the Future of Computing. https://www.labs.hpe.com/the-machine.
[19] IBM. 2017. OpenWhisk. https://developer.ibm.com/openwhisk.
[20] Eric Jonas, Shivaram Venkataraman, Ion Stoica, and Benjamin Recht. 2017. Occupy the Cloud: Distributed computing for the 99%. arXiv preprint arXiv:1702.04024 (2017).
[21] Microsoft. 2017. Azure Functions. https://azure.microsoft.com/en-us/services/functions.
[22] Netflix. 2016. Evolution of the Netflix Data Pipeline. http://techblog.netflix.com/2016/02/evolution-of-netflix-data-pipeline.html.
[23] OpenStack. 2017. Nova. https://docs.openstack.org/nova.
[24] OpenStack. 2017. Swift. https://docs.openstack.org/swift.
[25] OpenStack. 2017. Swift Large Objects. https://docs.openstack.org/swift/latest/overview_large_objects.html.
[26] Juan Piernas, Jarek Nieplocha, and Evan J. Felix. 2007. Evaluation of active storage strategies for the Lustre parallel file system. In SC. 28.
[27] Lingjun Qin and Dan Feng. 2006. Active Storage Framework for Object-based Storage Device. In AINA. 97-101.
[28] Simona Rabinovici-Cohen, Ealan Henis, John Marberg, and Kenneth Nagin. 2014. Storlet engine: performing computations in cloud storage. IBM Technical Report H-0320 (August 2014).
[29] Paul Rad, Van Lindberg, Jeff Prevost, Weining Zhang, and Mo Jamshidi. 2014. ZeroVM: secure distributed processing for big data analytics. In WAC. 1-6.
[30] Redis. 2017. Redis. https://redis.io.
[31] Erik Riedel, Garth A. Gibson, and Christos Faloutsos. 1998. Active Storage for Large-Scale Data Mining and Multimedia. In VLDB. 62-73.
[32] Michael T. Runde, Wesley G. Stevens, Paul A. Wortman, and John A. Chandy. 2012. An active storage framework for object storage devices. In MSST. 1-12.
[33] Josep Sampé. 2017. Zion framework. https://github.com/JosepSampe/zion.
[34] Josep Sampé, Pedro García-López, and Marc Sánchez-Artigas. 2016. Vertigo: Programmable micro-controllers for software-defined object storage. In IEEE CLOUD. 180-187.
[35] Seung Woo Son, Samuel Lang, Philip Carns, Robert Ross, Rajeev Thakur, Berkin Ozisikyilmaz, Prabhat Kumar, Wei-Keng Liao, and Alok Choudhary. 2010. Enabling active storage on parallel I/O software stacks. In MSST. 1-12.
[36] Zhipeng Tan, Yanli Yuan, Dan Feng, Tian Zhan, and Xudong Tu. 2011. Implementation of method object in active object-based storage systems. In CCIS. 204-211.
[37] Ubuntu. 2017. Cloud image repository. https://cloud-images.ubuntu.com.
[38] Ubuntu. 2017. Docker repository. https://hub.docker.com/r/_/ubuntu.
[39] University of California, Irvine. 1996. Adult Data Set. http://archive.ics.uci.edu/ml/datasets/Adult.
[40] Himanshu Vashishtha and Eleni Stroulia. 2011. Enhancing query support in HBase via an extended coprocessors framework. In ServiceWave. 75-87.
[41] Rajiv Wickremesinghe, Jeffrey S. Chase, and Jeffrey Scott Vitter. 2002. Distributed computing with load-managed active storage. In HPDC. 13-23.
[42] Yulai Xie, Dan Feng, Yan Li, and Darrell D. E. Long. 2016. Oasis: an active storage framework for object storage platform. Future Generation Computer Systems 56 (2016), 746-758.


Image processing One of the archetypal use cases of serverlesscomputing is that of image resizing for we found it interesting toevaluate it here It is ideal to show the potentials of asynchronousevent-based functions such as AWS Lambdas and also very usefulfor tenants that use the object store as a back-end for web imagestorage In this case when the images are uploaded a function istriggered resizing and creating all the needed images for examplefor the different possible devices that can request the main web(phone tablet PC etc)

We did the same with Zion and coded a function that resizes animage to an arbitrary percentage To intercept the PUT requestswe linked it to an onPut trigger and specified jpg as the suffixof the object name So all objects whose name ended with jpgwere rescaled As Zion allows to create new objects as part of thefunctionrsquos output the function stores the original object and itsresized version(s)

Because of interception one interesting feature of Zion is that itdoes not require to fully store an image before the generation of itsresized version(s) as it occurs with AWS Lambda and S3 Since it isdone ldquoon-the-flyrdquo prior to storage Zion saves storage bandwidthAlthough of not much concern at first glance this property is veryinteresting for concurrently processing a vast collection of imagesas in [10] because IO savings add up rapidly for a large amount ofobjects

Signature verification To demonstrate the versatility of Zionwe proposed the signature verification use case Online contentchecking is again a perfect use case for data-driven functions sinceit requires an immediate response to the client

The main objective of this function is to verify that the objects(documents) that are uploaded are signed by the user and to verifythat the signature is valid ie the documents are authentic WithZion is possible to do this task in near-real time and notify the userinstantly in the case of rejection Also in the case of rejection we

prevent the archival of an inauthentic document thereby savingstorage space For the same reasons as above it is readily evidentthat this use case cannot be implemented with AWS Lambdas

The scenario is as follows The user obtains the public key fromher RSA key pair and she uploads it to a public container in theobject storage system Then the user signs the document with theprivate key and uploads it with the signature in a special header tothe object storage system which puts into execution the functionand verifies the document Note that in this case the function usesan explicit parameter (signature) described above

To do so we implemented a signature verification function Thefunction is mapped with a container to the onPut trigger Thereforeall object PUTs to this container will be enforced The function firstgets the public key from the object store based on the user who isuploading the object Then it loads the document content and itverifies the signature If it is valid the document is stored otherwisethe document is rejected sending an error message to the user

Interactive queries and result aggregation Finally interactivequeries is a use case that perfectly matches our data-driven modelWhen we want to perform fast data queries over existing datarepositories our data-driven model avoids moving the entire datasetto a computing cluster

For example object storage services are commonly used toarchive data like log files Businesses that want to extract fast in-sights from these data repositories using big data analytic toolsmust choose between two strategies 1 Moving data to the com-putation cluster to take advantage from data locality or 2 Usinga connector to allow data analytic tools to read data directly fromthe remote object storage With Zion we offer a third strategyCompute in the storage cluster using functions that filter data andaggregate results inline

The case example we present is a top-k query on access logsof UbuntuOne [16] a personal cloud service We want to obtaina list of the most active users and the total number of requestseach user sent The implementation of this query in Zion has twofunctions The first one receives a fragment of the log file and filtersrequests logs maintaining a counter for each user thus exploitingparallelism The other function receives as input the partial countersof the various instances of the first function and it performs theaggregation and sorting in order to produce the desired result Toimplement this mapreduce-like example we tapped into SwiftrsquosSLO-based data partitioning (see Section 41 for further details)

6 EVALUATIONFor confirming our suspicions we first studied how Swift behaveswhen some computation tasks are collocated in the same storagenode We then ran micro-benchmarks to measure the behavior andthe overheads of our functions We did so by running standaloneexperiments and the applications discussed in Section 5

61 System SetupOur experimental testbed consisted of a host (or client) with 2VCPUsand 4GB RAM On the server side we deployed Zion in an 8-machines rack OpenStack Swift (ocata version) installation formedby 2 proxy nodes Dell PowerEdge R320 with 12GB RAM and 6storage nodes Dell PowerEdge R320 with 8GB RAM (each one with

127

Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA Josep Sampeacute Marc Saacutenchez-Artigas Pedro Garciacutea-Loacutepez Gerard Pariacutes

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

06

12

18

24

3

Respon

seTim

e(secon

ds) 1 Worker

2 Workers

3 Workers

4 Workers

(a) Transaction response time

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

20

40

60

80

100

CP

U U

sage

()

1 Worker

2 Workers

3 Workers

4 Workers

(b) Storage Node CPU Usage

Figure 3 Swift measurementMeasured response times (a) and CPU consumption for a given storage node (b) for getting objectsat a certain rate (transactions per second or TPS) for different values of Swift workers

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

500

1000

1500

2000

2500

3000

Respon

seTim

e(m

s)

80 Intf

60 Intf

40 Intf

20 Intf

0 Intf

(a) 1 Object Server worker

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

500

1000

1500

2000

2500

3000

Respon

seTim

e(m

s)

80 Intf

60 Intf

40 Intf

20 Intf

0 Intf

(b) 2 Object Server workers

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

500

1000

1500

2000

2500

3000

Respon

seTim

e(m

s)

80 Intf

60 Intf

40 Intf

20 Intf

0 Intf

(c) 3 Object Server workers

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

500

1000

1500

2000

2500

3000

Respon

seTim

e(m

s)

80 Intf

60 Intf

40 Intf

20 Intf

0 Intf

(d) 4 Object Server workers

Figure 4 Swift interference measurement These plots show how for a given storage node the swift object storage service behavesfor different number of workers and for different CPU loads of interference (Intf)

4 CPU cores) At the compute side the computation layer is com-posed by 3 nodes Dell PowerEdge R430 with 32GB RAM (each onewith 24 CPU cores) All the rack machines ran Ubuntu Server 1404The client host ran Ubuntu 16041 CloudImage [37]

62 Swift Resource ContentionWe first studied how collocation of compute and data affects SwiftTo do so in ldquoidealrdquo conditions we restricted this experiment to asingle storage node2 For this measurement we used a single proxynode Dell PowerEdge R320 with 12GB RAM and 1 storage nodeDell PowerEdge R320 with 8GB RAM

Base Swift The first experiment consisted of evaluating how Swiftnormally works (Fig 3) For this experiment we stored a bunchof random 10kB files into Swift Next using the unix httperftool we ran distinct workloads each one differing in the numberof transactions per second (TPS) We measured the resultant per-transaction response time (Fig 3a) and CPU usage (Fig 3b) of thestorage node

Technically Swift uses workers to handle a workload In Swift aworker is nothing but a thread that accepts requests Each worker

2 We note that the results are easy to extrapolate to larger Swift deployments

normally accepts 1 024 concurrent requests but it is a config-urable parameter This simple experiment confirmed us that whena worker exhausts the 100 of its corersquos resources the responsetime steeply increases due to the queuing delays For instance with1 worker the corersquos usage reaches 100 around 380 TPS the pointbeyond which the requests start to accumulate as shown in Fig3a This effect can be alleviated by starting new Swift workers asseen in this figure However since storage nodes have 4 cores onlySwift is limited to 4 workers And therefore a storage node willeventually end up experiencing resource contention sooner or latereven if there are no computations collocated with the data

Collocated Computation We repeated the same experiment butcollocating computations with the storage service The results inFig 4 show how the Swiftrsquos processing capacity diminishes as afunction of the CPU usage borrowed from the collocated tasks Forexample if Swift was restricted to use only 1 worker for requestprocessing collocated tasks would have almost no impact on SwiftHowever as the number of Swift workers increase to match thenumber of CPU cores the resource contention begins to producenegative effects in the storage system due to the interference fromthe collocated computations

128

Data-Driven Serverless Functions for Object Storage Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA

Swiftrsquos best practices recommend to use as many workers asCPU cores Under a heavy use of the CPU by the collocated tasksthis figure shows that Swift request processing would be severelydiminished For instance when the collocated tasks consumed 80of the total CPU Swift processing fell to a 35 in this experimentThis result suggests that the right path to go is to move computationsoutside the storage nodes which is what we did with Zion

63 Application SetupNow we describe the specific evaluation setup for our example usecases As object stores can contain a huge diversity of object typesand sizes [16] we set up heterogeneous workloads for all use casesin order to verify the behavior of Zion under different scenarios

Content-level access control For testing this application weused the dataset described in Section 5 but trimmed down to differ-ent object sizes 100kB 1MB 10MB and 100MB As stated beforethe dataset content is filtered out according to the type of userthat requested the object To this aim we used Swift user roles toreturn only specific fields to each user The function associated tothe onAfterGet trigger reads from an implicit parameter the JSONstring containing the allowed fields for a specific userrsquos role (egage education marital-status) and then returns them to theuser

Compression It is well-known that the compression ratio of ob-jects affects resource consumption Objects that contain only zeroswill be compressed more quickly and consume less resources thancompressing a binary object To get rid of this issue we chose touse text documents with a similar compression ratio of around60 The documents were of sizes 10kB 100kB 1MB and 10MBrespectively

Image processing As in the previous case we tried to find themost common scenario for testing this function We focused onthose image sizes commonly used in static web pages Finally weused different jpg files of 90kB 400kB 800kB and 1200 kB andwe set an implicit parameter so that the function resizes the imageto the 50 of its original size

Signature verification A sample usage of this function may be toverify official documents and forms in a public institution For thisexperiment we used text documents also of different sizes 10 kB100 kB 1 MB and 10 MB These documents are signed with a RSAprivate key The experiment operates on PUT requests verifyingthe correction of the signature

Interactive data queries and result aggregation For this usecase we used different sizes of the publicly available UbuntuOnersquoslog file [16] 100MB 1GB and 10GB respectively We comparedZionrsquos functions execution time to those obtained using HadoopWe built a 10-node Hadoop cluster of commodity workstations4-core i5 at 253 GHZ and 16 GB of RAM For this application weissued a Pig query against 1 The log files stored in HDFS and 2Swift using the Apache Hadoop-Swift connector [5] For Zion first-stage filtering functions we picked chunk sizes of 10MB 32MBand 200MB for the 100MB 1GB and 10GB files respectively

Table 1 Function information

Application LOC Function Size

Content-level Access Control asymp 29 27 kBCompression asymp 8 18 kBImage processing asymp 17 23 kBSignature verification asymp 43 29 kBInteractive data query asymp 203 6 kB

Figure 5: Zion runtime startup time. Time in seconds that a Docker container with the Zion Java runtime takes to start. [Histogram: number of Docker containers vs. startup time, 0.8–1.0 s.]

6.4 Application characteristics

Table 1 shows information about our five Zion applications. The LOC column gives the number of lines of code required to execute the function. The third column gives the function size for each application. From this table, it can be seen that our functions are very lightweight for the proposed applications.

6.5 Results

Docker characteristics. The Java runtime, and then the functions, are run inside Docker containers in our prototype. Consequently, the first validation to do was to assess the impact of starting our runtime within Docker containers. If the startup time was too large, it would hamper the inline processing capabilities of Zion. Our experiments, however, revealed that this is not the case.

For this experiment, we utilized a pre-compiled Docker image with our Java runtime integrated in it. The experiment consisted of launching 1,000 containers at different rates in the compute cluster and measuring the startup time, along with the amount of memory needed to keep the runtime up and running over time.

Fig. 5 shows the results of this experiment. As depicted, starting a new Docker container with the Java runtime takes between 0.85–0.95 seconds. Regarding RAM consumption, we found that each container consumes around 35MB of memory. These results show that our runtime takes 5X less time to start compared with AWS Lambda's runtime, which proves that Zion is lightweight enough for elastic inline processing. As Zion utilizes a large pool of already-started containers, the startup time is typically negligible, and only amounts to 0.9 secs if there are no free containers in the pool.



Figure 6: Zion base overhead. These boxplots show measured times of (a) requesting 10kB objects in Swift, and (b) getting the same objects with a no-operation (nop) function through Zion. [Axis: request time, 0.01–0.04 s.]

Zion overhead. As another basic experiment, we studied the extra overhead that Zion adds when running a function. The overhead is the time needed by the interception middleware to redirect the object to a compute node, take an available container, launch a function inside the Docker container, and pass the object data through it. For this experiment, we made use of the function listed in Listing 1.

This simple function iterates over the data without processing it. For measuring the overhead, we used a set of 10kB objects. We first launched 5,000 plain GET requests to measure the base time needed to complete a GET request. Then we launched another 5,000 GET requests for the same objects, but in this case we set up the function in Listing 1 to respond upon the onAfterGet trigger of the objects.

The results are plotted in Fig. 6. This figure shows that Zion's overhead is 9 ms. This time includes the penalty of 5 ms for the addition of an internal hop to Swift (redirecting the data through a compute node), plus 4 ms to determine whether, and which, function to run inside a free Docker container in the pool when the request touches a compute node.

Performance of collocated functions. We evaluated here the performance of our application functions. For this evaluation, we conducted a stress test for each combination of application, object size, and number of function workers. As our storage nodes have 4 CPU cores, we ran each function with 1, 2, 3 and 4 function workers, respectively. The Docker containers were set up as follows: 1 CPU core and 512MB of RAM per worker. The results are depicted in Figs. 7–10.

The first experiments were done collocating the containers in the storage nodes, without using the Zion computation layer. The main objectives of this experiment were: 1. To show how collocation limits the number of TPS; and 2. To show the performance of Zion's functions using 1 CPU core per function worker. As a side effect, notice that the latter makes it possible to faithfully generalize our results to any number of cores.

Figure 7: Compression function performance. [Four panels: response time (ms) vs. TPS for 10kB, 100kB, 1MB and 10MB objects, each with 1–4 function workers.]

Figure 8: Content-level access control function performance. [Four panels: response time (ms) vs. TPS for 100kB, 1MB, 10MB and 100MB objects, each with 1–4 function workers.]

We empirically verified this for the image resize use case: as the cluster is made up of 6 storage nodes, the function with 1 worker was able to process around 9 images per second per storage node for the 400kB image case, and around 54 images per second in the entire cluster. For the rest of the applications, we utilized only one storage node. The reason is that, to tap into the performance of the entire cluster, we would have to replicate each object 6 times to ensure that a copy of it is available at every storage node: in active storage, the total computation power for an object is proportional to the number of replicas it has.

Notice that in Figs. 7–10 the curves representing the response time start to increase when the resources of the Docker container are overloaded, i.e., when all the CPU cores are at 100% of their capacity. These experiments also show how important the object size is: in all experiments, the larger the object, the lower the number of TPS a function can handle. To wit, for the content-level access control application (Fig. 8), the function is unable to handle more than 1 TPS for the 100MB object, irrespective of the number of workers; concretely, this task takes around 16 seconds to complete. The same occurs with the image resizing function: we saw that resizing a 1200kB image takes around 2.36 seconds, leading to < 1 TPS.



Figure 9: Image resizer function performance. [Four panels: response time (ms) vs. TPS for 90kB, 400kB, 800kB and 1200kB images, each with 1–4 function workers.]

Figure 10: Signature verification function performance. [Four panels: response time (ms) vs. TPS for 10kB, 100kB, 1MB and 10MB objects, each with 1–4 function workers.]

A function that needs more than 1 second to process an object can only handle in parallel as many objects as function workers are running. These examples show that function collocation at the storage nodes would in any case be insufficient to absorb a burst of transactions lasting more than one second. They also demonstrate that it is almost impossible to predict resource consumption ahead of time, because it depends on the object size. Consequently, resource management for collocated functions should be dynamic, which is hard to achieve in practice, instead of a simple policy such as one single worker per CPU core.
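This capacity bound can be made explicit (our notation, a back-of-the-envelope model rather than one from the paper): with W function workers and a per-object service time of T seconds,

    max TPS ≈ W / T.

For the content-level access control case on 100MB objects (T ≈ 16 s), even 4 workers give at most 4 / 16 = 0.25 TPS, consistent with the < 1 TPS observed above.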

Performance of non-collocated functions. As resource contention is not an issue with non-collocated functions, what is key here is to verify that Zion's storage-disaggregated functions are indeed scalable.

In order to test how Zion's compute layer provides better scalability than the storage nodes, we extended the prior evaluation to non-collocated functions. Our compute nodes have 24 cores each, which means that it is possible to start up to 24 function workers in each node. However, to make the comparison fair with the storage nodes, we utilized between 1 and 8 function workers.

Figure 11: Image resizer function scalability. [Four panels: Max TPS vs. number of workers (1–8) for 90kB, 400kB, 800kB and 1200kB images; storage nodes vs. Zion compute layer.]

Figure 12: Signature verification function scalability. [Four panels: Max TPS vs. number of workers (1–8) for 10kB, 100kB, 1MB and 10MB objects; storage nodes vs. Zion compute layer.]

As in the previous experiments, we used 1 proxy and 1 storage node, in addition to 1 compute node, which is enough to assess the scalability of Zion. The results are shown in Figs. 11–14. In this case, we recorded the maximum number of transactions per second (Max TPS) that each worker was able to handle, with and without collocation.

First, we can see how in almost all cases, with 1 to 4 function workers, non-collocated functions can handle more transactions than the storage nodes. This is due to the fact that the compute nodes are more powerful than the storage nodes in terms of CPU, so their ingestion capacity is slightly higher. Second, the maximum TPS that a storage node can handle is always bounded above by the number of available cores; that is, spawning more workers brings no benefit, because they will contend for the available resources. This is the reason why the curve for collocated functions (gray line) flattens out beyond 4 workers in all use cases.

However, non-collocated functions (black line) can continue to process transactions by spawning more workers to respond to the demand. By a quick inspection of all the figures, the scalability of non-collocated functions is practically linear with the number of workers. This suggests that disaggregating storage and compute is a practical solution to scale out computation in cloud object stores.



Figure 13: Compression function scalability. [Four panels: Max TPS vs. number of workers (1–8) for 10kB, 100kB, 1MB and 10MB objects; storage nodes vs. Zion compute layer.]

Figure 14: Content-level access control function scalability. [Four panels: Max TPS vs. number of workers (1–8) for 100kB, 1MB, 10MB and 100MB objects; storage nodes vs. Zion compute layer.]


Interactive data queries. Table 2 compares the execution times of the same query (Listing 2) for Hadoop and Zion. The entries of this table were obtained by running each configuration 30 times and averaging the results.

Unsurprisingly, the Hadoop configuration that ingests the data from the remote Swift service is the one that presents the highest mean execution time. This occurred due to network contention: both clusters were geographically far apart, so communication went through some Fast Ethernet links within our institution's LAN. More concretely, it took ≈ 917 seconds to transfer the entire 10GB dataset, and ≈ 86 seconds to transfer the 1GB dataset.
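As a quick sanity check on these transfer times (our arithmetic, not the paper's): 10 GB / 917 s ≈ 10.9 MB/s ≈ 87 Mbps, and 1 GB / 86 s ≈ 11.6 MB/s ≈ 93 Mbps, both close to the nominal 100 Mbps of a Fast Ethernet link; the transfers were network-bound as expected.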

The configuration that read the data from HDFS had better times, especially for the large datasets. However, it has the disadvantage of requiring all the data to be pre-loaded into HDFS, which may not be practical in many cases.

For Zion functions, the execution times were comparatively small, despite using at most 24 workers. This suggests that better times can easily be attained with more workers, as is clearly seen when comparing Zion with 12 and 24 workers, respectively.

Table 2: Interactive data queries execution times

Configuration        File size   Chunk size   Time
Pig query - Swift    100 MB      10 MB        81.6 s
Pig query - HDFS     100 MB      10 MB        71.4 s
Zion - 12 Workers    100 MB      10 MB        5.10 s
Zion - 24 Workers    100 MB      10 MB        3.48 s
Pig query - Swift    1 GB        32 MB        156 s
Pig query - HDFS     1 GB        32 MB        75.4 s
Zion - 12 Workers    1 GB        32 MB        41.83 s
Zion - 24 Workers    1 GB        32 MB        22.56 s
Pig query - Swift    10 GB       200 MB       985 s
Pig query - HDFS     10 GB       200 MB       94.6 s
Zion - 12 Workers    10 GB       200 MB       262.53 s
Zion - 24 Workers    10 GB       200 MB       133.92 s

For the 10GB dataset, the query with 12 workers took 2X more time than with 24 workers. This experiment also confirms the scalability of Zion without incurring resource contention in the storage cluster: had we executed this interactive query with 24 workers in the storage nodes, the storage cluster CPU would have reached 100% in all storage nodes for 134 seconds, leading to resource contention.

Listing 2: Evaluated query in Hadoop and Swift + Zion clusters

select user_id, count(*) total
where (req_t = 'GetContentResponse' or req_t = 'PutContentResponse')
and msg = 'Request done'
group by user_id order by total DESC limit 10

7 CONCLUSION

This paper presents Zion, an innovative data-driven serverless computing middleware for object storage. Unlike commercial event-driven serverless models, our data-driven functions intercept and operate over objects as they are read/written from/to the object store. Since Zion is serverless, it overcomes the scalability and resource contention problems of active storage without the need to manage a server or a runtime environment.

By injecting computations in the data pipeline, Zion is ideal for use cases that require synchronous interactions with external users. Examples of these use cases include (but are not limited to) dynamic content generation, interactive queries, personalization, content verification, and access control. In many of them, the data locality of our inline computations contributes to optimizing latency and reducing data transfers outside the data center.

ACKNOWLEDGMENTS

This work has been partially funded by the European Union Horizon 2020 Framework Programme in the context of the project IOStack: Software-defined Storage for Big Data (H2020-644182), and by the Spanish Ministry of Science and Innovation (Grant TIN2016-77836-C2-1-R).



REFERENCES
[1] Amazon. 2017. Athena. https://aws.amazon.com/athena. (2017).
[2] Amazon. 2017. Lambda. https://aws.amazon.com/lambda. (2017).
[3] Amazon. 2017. Redshift Spectrum. https://aws.amazon.com/redshift/spectrum. (2017).
[4] Apache. 2012. HBASE Coprocessors. https://blogs.apache.org/hbase/entry/coprocessor_introduction. (2012).
[5] Apache. 2017. Hadoop OpenStack Support: Swift Object Store. https://hadoop.apache.org/docs/stable/hadoop-openstack/index.html. (2017).
[6] Krste Asanovic and D. Patterson. 2014. Firebox: A hardware building block for 2020 warehouse-scale computers. In FAST.
[7] Prosunjit Biswas, Farhan Patwa, and Ravi Sandhu. 2015. Content Level Access Control for OpenStack Swift Storage. In ACM CODASPY. 123–126.
[8] Chao Chen, Yong Chen, and Philip C. Roth. 2012. Dosas: Mitigating the resource contention in active storage systems. In CLUSTER. 164–172.
[9] Facebook. 2017. Presto: Distributed SQL Engine for Big Data. https://prestodb.io. (2017).
[10] Sadjad Fouladi, Riad S. Wahby, Brennan Shacklett, Karthikeyan Vasuki Balasubramaniam, William Zeng, Rahul Bhalerao, Anirudh Sivaraman, George Porter, and Keith Winstein. 2017. Encoding, Fast and Slow: Low-Latency Video Processing Using Thousands of Tiny Threads. In NSDI.
[11] Peter X. Gao, Akshay Narayan, Sagar Karandikar, Joao Carreira, Sangjin Han, Rachit Agarwal, Sylvia Ratnasamy, and Scott Shenker. 2016. Network requirements for resource disaggregation. In OSDI. 249–264.
[12] Google. 2009. BigTable Coprocessors. https://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf. (2009).
[13] Google. 2017. Cloud Functions. https://cloud.google.com/functions/docs. (2017).
[14] Raúl Gracia-Tinedo, Pedro García-López, Marc Sánchez-Artigas, Josep Sampé, Yosef Moatti, Eran Rom, Dalit Naor, Ramon Nou, Toni Cortés, Pietro Michiardi, and William Oppermann. 2016. IOSTACK: Software-defined Object Storage. IEEE Internet Computing (2016).
[15] Raúl Gracia-Tinedo, Josep Sampé, Edgar Zamora, Marc Sánchez-Artigas, Pedro García-López, Yosef Moatti, and Eran Rom. 2017. Crystal: Software-Defined Storage for Multi-Tenant Object Stores. In FAST.
[16] Raúl Gracia-Tinedo, Yongchao Tian, Josep Sampé, Hamza Harkous, John Lenton, Pedro García-López, Marc Sánchez-Artigas, and Marko Vukolic. 2015. Dissecting UbuntuOne: Autopsy of a Global-scale Personal Cloud Back-end. In IMC. 155–168.
[17] Scott Hendrickson, Stephen Sturdevant, Tyler Harter, Venkateshwaran Venkataramani, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2016. Serverless computation with OpenLambda. In HotCloud.
[18] HP. 2017. The Machine: Our vision for the Future of Computing. https://www.labs.hpe.com/the-machine. (2017).
[19] IBM. 2017. OpenWhisk. https://developer.ibm.com/openwhisk. (2017).
[20] Eric Jonas, Shivaram Venkataraman, Ion Stoica, and Benjamin Recht. 2017. Occupy the Cloud: Distributed computing for the 99%. arXiv preprint arXiv:1702.04024 (2017).
[21] Microsoft. 2017. Azure Functions. https://azure.microsoft.com/en-us/services/functions. (2017).
[22] Netflix. 2016. Evolution of the Netflix Data Pipeline. http://techblog.netflix.com/2016/02/evolution-of-netflix-data-pipeline.html. (2016).
[23] OpenStack. 2017. Nova. https://docs.openstack.org/nova. (2017).
[24] OpenStack. 2017. Swift. https://docs.openstack.org/swift. (2017).
[25] OpenStack. 2017. Swift Large Objects. https://docs.openstack.org/swift/latest/overview_large_objects.html. (2017).
[26] Juan Piernas, Jarek Nieplocha, and Evan J. Felix. 2007. Evaluation of active storage strategies for the lustre parallel file system. In SC. 28.
[27] Lingjun Qin and Dan Feng. 2006. Active Storage Framework for Object-based Storage Device. In AINA. 97–101.
[28] Simona Rabinovici-Cohen, Ealan Henis, John Marberg, and Kenneth Nagin. 2014. Storlet engine: performing computations in cloud storage. Technical Report. IBM Technical Report H-0320 (August 2014).
[29] Paul Rad, Van Lindberg, Jeff Prevost, Weining Zhang, and Mo Jamshidi. 2014. ZeroVM: secure distributed processing for big data analytics. In WAC. 1–6.
[30] Redis. 2017. Redis. https://redis.io. (2017).
[31] Erik Riedel, Garth A. Gibson, and Christos Faloutsos. 1998. Active Storage for Large-Scale Data Mining and Multimedia. In VLDB. 62–73.
[32] Michael T. Runde, Wesley G. Stevens, Paul A. Wortman, and John A. Chandy. 2012. An active storage framework for object storage devices. In MSST. 1–12.
[33] Josep Sampé. 2017. Zion framework. https://github.com/JosepSampe/zion. (2017).
[34] Josep Sampé, Pedro García-López, and Marc Sánchez-Artigas. 2016. Vertigo: Programmable micro-controllers for software-defined object storage. In IEEE CLOUD. 180–187.
[35] Seung Woo Son, Samuel Lang, Philip Carns, Robert Ross, Rajeev Thakur, Berkin Ozisikyilmaz, Prabhat Kumar, Wei-Keng Liao, and Alok Choudhary. 2010. Enabling active storage on parallel I/O software stacks. In MSST. 1–12.
[36] Zhipeng Tan, Yanli Yuan, Dan Feng, Tian Zhan, and Xudong Tu. 2011. Implementation of method object in active object-based storage systems. In CCIS. 204–211.
[37] Ubuntu. 2017. Cloud image repository. https://cloud-images.ubuntu.com. (2017).
[38] Ubuntu. 2017. Docker repository. https://hub.docker.com/_/ubuntu. (2017).
[39] University of California, Irvine. 1996. Adult Data Set. http://archive.ics.uci.edu/ml/datasets/Adult. (1996).
[40] Himanshu Vashishtha and Eleni Stroulia. 2011. Enhancing query support in hbase via an extended coprocessors framework. In ServiceWave. 75–87.
[41] Rajiv Wickremesinghe, Jeffrey S. Chase, and Jeffrey Scott Vitter. 2002. Distributed computing with load-managed active storage. In HPDC. 13–23.
[42] Yulai Xie, Dan Feng, Yan Li, and Darrell D. E. Long. 2016. Oasis: an active storage framework for object storage platform. Future Generation Computer Systems 56 (2016), 746–758.


  • Abstract
  • 1 Introduction
  • 2 Related Work
  • 3 Design Overview
    • 31 Interception Software and Metadata Service
    • 32 Computation Layer
      • 4 Implementation
        • 41 Interception Software and Metadata Service
        • 42 Computation Layer
          • 5 Applications
          • 6 Evaluation
            • 61 System Setup
            • 62 Swift Resource Contention
            • 63 Application Setup
            • 64 Application characteristics
            • 65 Results
              • 7 Conclusion
              • Acknowledgments
              • References
Page 6: Data-Driven Serverless Functions for Object Storagefaculty.washington.edu/wlloyd/courses/tcss562... · using AWS Lambda for data-intensive tasks over disaggregated storage [10, 20]

Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA Josep Sampeacute Marc Saacutenchez-Artigas Pedro Garciacutea-Loacutepez Gerard Pariacutes

Zion Service

Java Runtime

Java VM

Function 1

Function 1Worker 1

Java Runtime

Java VM

Function 1

Function 1Worker 2

Java Runtime

Java VM

Function 2

Function 2Worker 1

ToFrom Proxy TF Storage Node

Java Runtime

Docker waiting for a new function worker

Figure 2 Zion Compute Node High level architectureoverview of the components of a Zion Compute Node

of an object may have different data and metadata As functionsare stateless ie there is no relation between the different functioninvocations even if they occur in the same worker modifying thelocal copy of an objectrsquos metadata implies waiting for Swift toupdate all the replicas This behavior is clearly undesirable in termsof functionrsquos parallelism

To mitigate this issue each function worker is directly attachedto a distributed metadata caching system based on Redis It is aninternal feature totally transparent to the users who develop func-tions When a function is executed it initially loads the objectrsquosmetadata into the cache and then performs all the metadata modi-fications over it If there are other requests to the same object theother function invocations will use the previously-loaded metadataWhen the function terminates it offloads the metadata from thecache to all objectrsquos replicas

With this distributed cache we ensure that the objectsrsquo metadatatouched by the functions is always consistent in all replicas evenif there are multiple requests at the same time With it functionscan even do atomic operations over the metadata fields

5 APPLICATIONSHere we show some of the applications that Zion can support andthe ease with which these applications can be built on top of SwiftAs expected the described use cases are not arbitrary they havebeen chosen to show the potential of our data-driven model and inparticular for synchronous interactions with the object store Allof them have been implemented over Swift and evaluated in thenext section

Content-level Access Control With Zion it is extremely easyto implement sophisticated forms of access control based upon thecontents of the objects themselves This example is ideal to showthe limitations of AWS Lambda First it clear that access controlrequires the interception of the GET requests which can only bedone indirectly with the help of the API Gateway service Secondthe processing of the objectrsquos content to satisfy the access controlpolicy must be done inline as the user needs a timely notificationabout the status of her request which again cannot be realizedwith AWS Lambda Finally Zionrsquos design enables access controlto leverage data locality and improve response time as functions

go through the contents of the object as it is read from the storagenode

Typically access control in object storage operates at the gran-ularity of containers and hence once an object is accessible tosome party he gets the full content of the object Swift also followsthis ldquoall or nothingrdquo approach where the access to objects insidea container is enforced through access control lists (ACLs) whichmay be insufficient when objects contain sensitive content

In the exercise to show another capability of our framework weshow how content level access control can be realized very easilyin Swift thanks to our function abstraction in a similar fashion butcleaner than [7] By ldquocontent levelrdquo we mean that Swift users will beable to access to certain parts of an object based on their credentialsTo give a concrete example consider the publicly available Adultdataset from the UCI Machine Learning Repository [39] which con-tains about 48 000 rows of census information Each row containsattributes like race sex and marital-status which combinedwith explicit identifiers such as the SSN1 that identify the recordholders may leak sensitive information about a given individualAs a result the records of this object should be accessed differ-ently depending upon the user role For instance while a ldquopoliceagentrdquo should be able to access to all fields SSN age educationmarital-status race sex relationship capital-gain andnative-country a ldquocensus analystrdquo could be restricted to get onlya smaller view age education capital-gain native-country

To implement this example we have linked a function to theonAfterGet object trigger to enforce content level access controlon the object adult_datacsv We have defined a simple accesspolicy that depending on the use role ldquopolice agentrdquoor ldquocensusanalystrdquo allows to get all the fields or just an smaller projectionview This simple access policy has been stored as implicit parameterof the function ie in a JSON formatted string uploaded when welinked the object with the function as explained in 321 When aGET request comes for the object adult_datacsv the proxy firstchecks the Swift ACL If the object is accessible by that user thefunction then reads the content level policy and filters the dataonly if the user has the appropriate role

Compression A typical data reduction task is the compressionof objects In general any dynamic content filtering that involvesinline transformations of the content is ideally suited for our data-driven model A data-driven task could compressdecompress a filedynamically ldquoon-the-flyrdquo As Zion acts directly on the data pipelineie as the object is readwrite to the object store the result willbe either latency improvement or space reduction depending uponwhether Zion intercepts a GET request or a PUT request

Here we will merely consider the case of compressing incomingobjects ldquoon-the-flyrdquo which means that upon a GET request by theuser the target object will have to undergo decompression Thecommon approach for doing so in Swift is to implement an ad-hoccompression middleware However this approach is problematicFirst by running compression on the storage nodes compressionis repeated as many times are replicas there are By offloading it tothe proxies we can disrupt Swiftrsquos normal operation under heavy

1As the Adult dataset does not contain explicit identifiers we added a random SSN toeach row using the Fake Name Generator

126

Data-Driven Serverless Functions for Object Storage Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA

load since the proxies are responsible for looking up the locationof objects in the rings and routing the requests accordingly

With Zion we can easily write down a compression functionand execute it between the proxies and the storage nodes in a scal-able way without worrying about resources or repeating the taskmany times In addition to the possibility to intercept GET requestsZion has another advantage over AWS Lambda It is capable to runcompression over objects whose total compression time exceed thefive minutesrsquo limit This is because Zion cancels a request only ifthe time for the receipt of the first byte from the function exceeds atimeout value (see Section 321 for details) This model is ideal foroperations such as compression that can be run as data is readwritewhich have been the focus of active storage for a long time

To implement the compression function we utilized gzip andthen we mapped it to a container with the onPut trigger As suchall writes to this container will be compressed We did the reverseprocess for the onAfterGet trigger so that when a user requestedan object of this container she would get the original uncompressedversion To do so we made use of an implicit parameter to tell thefunction what to do either to compress or decompress That is forthe onPut trigger the implicit parameter was set to ldquocompressionrdquoFor the onAfterGet trigger the implicit parameter value was setto ldquodecompressionrdquo

Image processing One of the archetypal use cases of serverlesscomputing is that of image resizing for we found it interesting toevaluate it here It is ideal to show the potentials of asynchronousevent-based functions such as AWS Lambdas and also very usefulfor tenants that use the object store as a back-end for web imagestorage In this case when the images are uploaded a function istriggered resizing and creating all the needed images for examplefor the different possible devices that can request the main web(phone tablet PC etc)

We did the same with Zion and coded a function that resizes animage to an arbitrary percentage To intercept the PUT requestswe linked it to an onPut trigger and specified jpg as the suffixof the object name So all objects whose name ended with jpgwere rescaled As Zion allows to create new objects as part of thefunctionrsquos output the function stores the original object and itsresized version(s)

Because of interception one interesting feature of Zion is that itdoes not require to fully store an image before the generation of itsresized version(s) as it occurs with AWS Lambda and S3 Since it isdone ldquoon-the-flyrdquo prior to storage Zion saves storage bandwidthAlthough of not much concern at first glance this property is veryinteresting for concurrently processing a vast collection of imagesas in [10] because IO savings add up rapidly for a large amount ofobjects

Signature verification To demonstrate the versatility of Zionwe proposed the signature verification use case Online contentchecking is again a perfect use case for data-driven functions sinceit requires an immediate response to the client

The main objective of this function is to verify that the objects(documents) that are uploaded are signed by the user and to verifythat the signature is valid ie the documents are authentic WithZion is possible to do this task in near-real time and notify the userinstantly in the case of rejection Also in the case of rejection we

prevent the archival of an inauthentic document thereby savingstorage space For the same reasons as above it is readily evidentthat this use case cannot be implemented with AWS Lambdas

The scenario is as follows The user obtains the public key fromher RSA key pair and she uploads it to a public container in theobject storage system Then the user signs the document with theprivate key and uploads it with the signature in a special header tothe object storage system which puts into execution the functionand verifies the document Note that in this case the function usesan explicit parameter (signature) described above

To do so we implemented a signature verification function Thefunction is mapped with a container to the onPut trigger Thereforeall object PUTs to this container will be enforced The function firstgets the public key from the object store based on the user who isuploading the object Then it loads the document content and itverifies the signature If it is valid the document is stored otherwisethe document is rejected sending an error message to the user

Interactive queries and result aggregation Finally interactivequeries is a use case that perfectly matches our data-driven modelWhen we want to perform fast data queries over existing datarepositories our data-driven model avoids moving the entire datasetto a computing cluster

For example object storage services are commonly used toarchive data like log files Businesses that want to extract fast in-sights from these data repositories using big data analytic toolsmust choose between two strategies 1 Moving data to the com-putation cluster to take advantage from data locality or 2 Usinga connector to allow data analytic tools to read data directly fromthe remote object storage With Zion we offer a third strategyCompute in the storage cluster using functions that filter data andaggregate results inline

The case example we present is a top-k query on access logsof UbuntuOne [16] a personal cloud service We want to obtaina list of the most active users and the total number of requestseach user sent The implementation of this query in Zion has twofunctions The first one receives a fragment of the log file and filtersrequests logs maintaining a counter for each user thus exploitingparallelism The other function receives as input the partial countersof the various instances of the first function and it performs theaggregation and sorting in order to produce the desired result Toimplement this mapreduce-like example we tapped into SwiftrsquosSLO-based data partitioning (see Section 41 for further details)

6 EVALUATIONFor confirming our suspicions we first studied how Swift behaveswhen some computation tasks are collocated in the same storagenode We then ran micro-benchmarks to measure the behavior andthe overheads of our functions We did so by running standaloneexperiments and the applications discussed in Section 5

61 System SetupOur experimental testbed consisted of a host (or client) with 2VCPUsand 4GB RAM On the server side we deployed Zion in an 8-machines rack OpenStack Swift (ocata version) installation formedby 2 proxy nodes Dell PowerEdge R320 with 12GB RAM and 6storage nodes Dell PowerEdge R320 with 8GB RAM (each one with

127

Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA Josep Sampeacute Marc Saacutenchez-Artigas Pedro Garciacutea-Loacutepez Gerard Pariacutes

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

06

12

18

24

3

Respon

seTim

e(secon

ds) 1 Worker

2 Workers

3 Workers

4 Workers

(a) Transaction response time

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

20

40

60

80

100

CP

U U

sage

()

1 Worker

2 Workers

3 Workers

4 Workers

(b) Storage Node CPU Usage

Figure 3 Swift measurementMeasured response times (a) and CPU consumption for a given storage node (b) for getting objectsat a certain rate (transactions per second or TPS) for different values of Swift workers

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

500

1000

1500

2000

2500

3000

Respon

seTim

e(m

s)

80 Intf

60 Intf

40 Intf

20 Intf

0 Intf

(a) 1 Object Server worker

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

500

1000

1500

2000

2500

3000

Respon

seTim

e(m

s)

80 Intf

60 Intf

40 Intf

20 Intf

0 Intf

(b) 2 Object Server workers

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

500

1000

1500

2000

2500

3000

Respon

seTim

e(m

s)

80 Intf

60 Intf

40 Intf

20 Intf

0 Intf

(c) 3 Object Server workers

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

500

1000

1500

2000

2500

3000

Respon

seTim

e(m

s)

80 Intf

60 Intf

40 Intf

20 Intf

0 Intf

(d) 4 Object Server workers

Figure 4 Swift interference measurement These plots show how for a given storage node the swift object storage service behavesfor different number of workers and for different CPU loads of interference (Intf)

4 CPU cores) At the compute side the computation layer is com-posed by 3 nodes Dell PowerEdge R430 with 32GB RAM (each onewith 24 CPU cores) All the rack machines ran Ubuntu Server 1404The client host ran Ubuntu 16041 CloudImage [37]

62 Swift Resource ContentionWe first studied how collocation of compute and data affects SwiftTo do so in ldquoidealrdquo conditions we restricted this experiment to asingle storage node2 For this measurement we used a single proxynode Dell PowerEdge R320 with 12GB RAM and 1 storage nodeDell PowerEdge R320 with 8GB RAM

Base Swift The first experiment consisted of evaluating how Swiftnormally works (Fig 3) For this experiment we stored a bunchof random 10kB files into Swift Next using the unix httperftool we ran distinct workloads each one differing in the numberof transactions per second (TPS) We measured the resultant per-transaction response time (Fig 3a) and CPU usage (Fig 3b) of thestorage node

Technically Swift uses workers to handle a workload In Swift aworker is nothing but a thread that accepts requests Each worker

2 We note that the results are easy to extrapolate to larger Swift deployments

normally accepts 1 024 concurrent requests but it is a config-urable parameter This simple experiment confirmed us that whena worker exhausts the 100 of its corersquos resources the responsetime steeply increases due to the queuing delays For instance with1 worker the corersquos usage reaches 100 around 380 TPS the pointbeyond which the requests start to accumulate as shown in Fig3a This effect can be alleviated by starting new Swift workers asseen in this figure However since storage nodes have 4 cores onlySwift is limited to 4 workers And therefore a storage node willeventually end up experiencing resource contention sooner or latereven if there are no computations collocated with the data

Collocated Computation We repeated the same experiment butcollocating computations with the storage service The results inFig 4 show how the Swiftrsquos processing capacity diminishes as afunction of the CPU usage borrowed from the collocated tasks Forexample if Swift was restricted to use only 1 worker for requestprocessing collocated tasks would have almost no impact on SwiftHowever as the number of Swift workers increase to match thenumber of CPU cores the resource contention begins to producenegative effects in the storage system due to the interference fromthe collocated computations

128

Data-Driven Serverless Functions for Object Storage Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA

Swiftrsquos best practices recommend to use as many workers asCPU cores Under a heavy use of the CPU by the collocated tasksthis figure shows that Swift request processing would be severelydiminished For instance when the collocated tasks consumed 80of the total CPU Swift processing fell to a 35 in this experimentThis result suggests that the right path to go is to move computationsoutside the storage nodes which is what we did with Zion

63 Application SetupNow we describe the specific evaluation setup for our example usecases As object stores can contain a huge diversity of object typesand sizes [16] we set up heterogeneous workloads for all use casesin order to verify the behavior of Zion under different scenarios

Content-level access control For testing this application weused the dataset described in Section 5 but trimmed down to differ-ent object sizes 100kB 1MB 10MB and 100MB As stated beforethe dataset content is filtered out according to the type of userthat requested the object To this aim we used Swift user roles toreturn only specific fields to each user The function associated tothe onAfterGet trigger reads from an implicit parameter the JSONstring containing the allowed fields for a specific userrsquos role (egage education marital-status) and then returns them to theuser

Compression It is well-known that the compression ratio of ob-jects affects resource consumption Objects that contain only zeroswill be compressed more quickly and consume less resources thancompressing a binary object To get rid of this issue we chose touse text documents with a similar compression ratio of around60 The documents were of sizes 10kB 100kB 1MB and 10MBrespectively

Image processing As in the previous case we tried to find themost common scenario for testing this function We focused onthose image sizes commonly used in static web pages Finally weused different jpg files of 90kB 400kB 800kB and 1200 kB andwe set an implicit parameter so that the function resizes the imageto the 50 of its original size

Signature verification A sample usage of this function may be toverify official documents and forms in a public institution For thisexperiment we used text documents also of different sizes 10 kB100 kB 1 MB and 10 MB These documents are signed with a RSAprivate key The experiment operates on PUT requests verifyingthe correction of the signature

Interactive data queries and result aggregation For this usecase we used different sizes of the publicly available UbuntuOnersquoslog file [16] 100MB 1GB and 10GB respectively We comparedZionrsquos functions execution time to those obtained using HadoopWe built a 10-node Hadoop cluster of commodity workstations4-core i5 at 253 GHZ and 16 GB of RAM For this application weissued a Pig query against 1 The log files stored in HDFS and 2Swift using the Apache Hadoop-Swift connector [5] For Zion first-stage filtering functions we picked chunk sizes of 10MB 32MBand 200MB for the 100MB 1GB and 10GB files respectively

Table 1 Function information

Application LOC Function Size

Content-level Access Control asymp 29 27 kBCompression asymp 8 18 kBImage processing asymp 17 23 kBSignature verification asymp 43 29 kBInteractive data query asymp 203 6 kB

08 085 09 095 1

Startup Time (seconds)

0

30

60

90

120

150

Numof

Dokcers

Figure 5 Zion runtime startup time Time in seconds that aDocker with the Zion Java runtime takes to start

64 Application characteristicsTable 1 shows information of our four Zion applications The LOCcolumn gives the number of lines of code required to execute thefunction The third column gives the Function size used for eachapplication From this table it can be seen that our functions arevery lightweight for the proposed applications

65 Results

Docker characteristics The Java runtime and then the functionsare run inside Docker containers in our prototype Consequentlythe first validation to do was to assess the impact of starting ourruntime within Docker containers If the startup time was too largeit would hamper the inline processing capabilities of Zion Ourexperiments however revealed that this is not the case

For this experiment we utilized a pre-compiled Docker imagewith our Java runtime integrated in it The experiment consisted oflaunching 1 000 containers at different rates in the compute clusterand measure the startup time and memory along with the amountof memory for keeping the runtime up and running over time

Fig 5 shows the results of this experiment As depicted startinga new Docker with the Java runtime takes between 085 minus 095seconds Regarding RAM consumption we got that each containerconsumes around 35MB of memory These results show how ourruntime takes 5X times less to start in comparison with the AWSLambdarsquos runtime which proves that Zion is lightweight enoughfor elastic inline processing As Zion utilizes a large pool of alreadystarted containers the start up time is typically negligible in manycases and only amounts to 09 secs if there are no free containersin the pool

129

Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA Josep Sampeacute Marc Saacutenchez-Artigas Pedro Garciacutea-Loacutepez Gerard Pariacutes

001 0015 002 0025 003 0035 004

(a) Measured request times with Swift

001 0015 002 0025 003 0035 004

(b) Measured request times with Swift + Zion

Figure 6 Zion base overhead These boxplots show measuredtimes of requesting 10kB objects in Swift (a) and gettingthe same objects with a no-operation (nop) function troughZion (b)

Zion overhead As another basic experiment we studied the extraoverhead that Zion adds when running a function The overheadis the time needed by the interception middleware to redirect theobject to a compute node take an available container launch afunction inside the Docker and pass the object data through it Forthis experiment we made use of the function listed in Listing 1

This simple function iterates over the data without processingit For measuring the overhead we used a set of 10kB objects Wefirst launched 5 000 plain GET requests to measure the base timeneeded to complete a GET request Then we launched another5 000 GET requests for the same objects but in this case we set upthe function in Listing 1 to respond upon the onAfterGet triggerof the objects

The results are plotted in Fig 6 This figure shows that the Zionrsquosoverhead is 9 ms This time includes the penalty of 5 ms for theaddition of an internal hop to Swift (redirect the data through acompute node) plus 4 ms to determine whether and which functionto run inside a free Docker container in the pool when the requesttouches a compute node

Performance of collocated functions We evaluated here theperformance of our application functions For this evaluation weconducted a stress test for each combination of application objectsize and number of function workers As our storage nodes have 4CPU cores we ran each function on 1 2 3 and 4 function workersrespectively The Docker containers were set up as follows 1 CPUcore and 512MB of RAM per worker The results are depicted inFig 7-10

The first experiments were done collocating the containers inthe storage nodes without using the Zion computation layer Themain objective of this experiment was 1 To show how collocationlimits the number of TPS and 2 To show the performance of Zionrsquosfunctions using 1 CPU core per function worker As a side effectnotice that the latter makes it possible to faithfully generalize our

200 300 400 5000

500

1000

1500

2000

Respon

setime(m

s)

10 kB

1 worker 2 workers 3 workers 4 workers

200 300 400 5000

500

1000

1500

2000

100kB

40 60 80 100

TPS

0

500

1000

1500

Respon

setime(m

s)

1MB

5 10 15 20 25 30

TPS

0

1000

2000

3000

4000

10MB

Figure 7 Compression function performance

20 40 60 80 1000

500

1000

1500

2000

1 MB

5 10 15 20

Transactions per Second (TPS)

0

1000

2000

3000

Respon

setime(m

s)

10 MB

2 4 6 8 10

Transactions per Second (TPS)

0

2000

4000

6000

8000

100 MB

0 100 200 300 400 5000

500

1000

1500

2000

Respon

setime(m

s)

100 kB

1 worker 2 workers 3 workers 4 workers

Figure 8 Content-level access control function perfor-mance

results to any number of cores We empirically verified this for theimage resize use case As the cluster is made up of 6 storage nodesthe function with 1 worker was able to process around 9 images persecond for the 400kB image case and around 54 images per secondin the entire cluster For the rest of applications we utilized onlyone storage node The reason is that to tap into the performance ofthe entire cluster we would have to replicate 6 times each object toensure that a copy of it is available at every storage node In activestorage the total computation power for an object is proportionalto the number of replicas it has

Notice that in Fig 7- 10 the curves representing the responsetime start to increase when the resources of the Docker containerare overloaded ie when all the CPU cores are at the 100 of theircapacity These experiments also show how the object size is veryimportant In all experiments the higher the object size the lowerthe number of TPS a function can handle To wit for the content-level access control application (Fig 8) the function is unable tohandle more than 1 TPS for the 100MB object irrespective of thenumber of workers Concretely this task takes around 16 secondsto complete The same occurs with the image resizing function We

130

Data-Driven Serverless Functions for Object Storage Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA

10 20 30 40 50 600

1000

2000

3000

4000

Respon

setime(m

s)

90 kB

1 worker 2 workers 3 workers 4 workers

10 20 30 400

500

1000

1500

2000

400 kB

5 10 15 20

Transactions per Second (TPS)

0

1000

2000

3000

4000

Respon

setime(m

s)

800 kB

2 4 6 8 10

Transactions per Second (TPS)

0

1

2

3

4times104

1200 kB

Figure 9 Image resizer function performance

100 150 200 250 3000

500

1000

1500

2000

Respon

setime(m

s)

10 kB

1 worker 2 workers 3 workers 4 workers

50 100 150 200 2500

500

1000

1500

2000

100 kB

20 40 60 80 100 120

Transactions per Second (TPS)

0

500

1000

1500

Respon

setime(m

s)

1 MB

5 10 15 20 25 30

Transactions per Second (TPS)

0

500

1000

1500

2000

10 MB

Figure 10 Signature verification function performance

saw that resizing an image of 1200kB takes around 236 secondsleading to lt 1 TPS

A function that needs more than 1 second to process an objectit can only handle in parallel as many objects as functions workersare running These examples show that function collocation at thestorage nodes would be in any case insufficient to absorb a burst oftransactions for more than one second Also they demonstrate thatit is almost impossible to predict resource consumption ahead oftime because resources depend on the object size Consequentlyresource management for collocated functions should be dynamicwhich is hard to achieve in practice instead of a simple resourcemanagement policy such as one single worker per CPU core

Performance of non-collocated functions As resource con-tention is not an issue with non-collocated functions what is keyhere is to verify that Zionrsquos storage-disaggregated functions areindeed scalable

In order to test how the Zionrsquos compute layer provides betterscalability than the storage nodes we extended the prior evaluationto non-collocated functions Our compute nodes have 24 cores eachone This means that is possible to start up to 24 function workers in

0 1 2 3 4 5 6 7 810

20

30

40

Max

TPS

90 kB

Storage nodes Zion compute layer

0 1 2 3 4 5 6 7 85

10

15

20

25

400 kB

0 1 2 3 4 5 6 7 8

Workers

0

2

4

6

8

Max

TPS

800 kB

0 1 2 3 4 5 6 7 8

Workers

-1

0

1

2

3

1200 kB

Figure 11 Image resizer function scalability

0 1 2 3 4 5 6 7 8100

150

200

250

300

Max

TPS

10 kB

Storage nodes Zion compute layer

0 1 2 3 4 5 6 7 850

100

150

200

250

100 kB

0 1 2 3 4 5 6 7 8

Workers

0

50

100

150

200

Max

TPS

1 MB

0 1 2 3 4 5 6 7 8

Workers

6

8

10

12

14

10 MB

Figure 12 Signature verification function scalability

each node However to make the comparative fair with the storagenodes we utilized between 1 to 8 function workers

As in the previous experiments we used 1 proxy and 1 storagenode in addition to 1 compute node which is enough to assessthe scalability of Zion The results are shown in Fig 13- 12 In thiscase we recorded the number of maximum transactions per second(Max TPS) that each worker was able to handle with and withoutcollocation

First we can see how in almost all cases with 1 to 4 functionworkers non-collocated functions can handle more transactionsthan the storage nodes This is due to the fact that the computenodes are more powerful than the storage nodes in terms of CPUTherefore the capacity of ingestion is slightly higher Second themaximum TPS that a storage node can handle is always limitedabove by the number of available cores That is spawning moreworkers has no benefit because they will contend for the availableresources This is the reason why the curve for collocated functions(gray line) flattens out beyond 4 workers in all use cases

However non-collocated functions (black line) can continue toprocess transactions by spawning more workers to respond to thedemand By a quick inspection of all the figures the scalability ofnon-collocated functions is practically linear with the number of

131

Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA Josep Sampeacute Marc Saacutenchez-Artigas Pedro Garciacutea-Loacutepez Gerard Pariacutes

0 1 2 3 4 5 6 7 8200

400

600

800

1000

Max

TPS 10 kB

Storage nodes Zion compute layer

0 1 2 3 4 5 6 7 80

200

400

600

800

100 kB

0 1 2 3 4 5 6 7 8

Workers

20

40

60

80

Max

TPS 1 MB

0 1 2 3 4 5 6 7 8

Workers

0

5

10

15

10 MB

Figure 13 Compression function scalability

0 1 2 3 4 5 6 7 80

200

400

600

Max

TPS

100 kB

Storage nodes Zion compute layer

0 1 2 3 4 5 6 7 820

40

60

80

1 MB

0 1 2 3 4 5 6 7 8

Workers

0

5

10

Max

TPS

10 MB

0 1 2 3 4 5 6 7 8

Workers

0

1

2

3

100 MB

Figure 14 Content-level access control function scalability

workers This suggests that disaggregating storage and compute is apractical solution to scale out computation in cloud object stores

Interactive data queries Table 2 compares the execution timesfor the same query for Hadoop and Zion (Listing 2) The entries ofthis table were obtained by running 30X each configuration andthen averaging the results

Non-surprisingly the Hadoop configuration that ingests the datafrom the remote Swift service is the one that presents the highestmean execution time This occurred due to network contentionBoth clusters were geographically far apart so communicationwent through some FAST Ethernet links within our institutionrsquosLAN More concretely it took asymp 917 seconds to transfer the entire10GB dataset and asymp 86 seconds to transfer the 1GB dataset

The configuration that read the data from HDFS had better timesspecially for the large datasets However it has the disadvantagethat it requires pre-loading all the data in HDFS which may be notpractical in many cases

For Zion functions the execution times were comparativelysmall despite using at most 24 workers This suggests that bettertimes can be easily attained with more workers This is clearly seenwhen comparing Zion with 12 and 24 workers respectively For

Table 2 Interactive data queries execution times

Configuration File size Chunk size Time

Pig query - Swift 100 MB 10 MB 816sPig query - HDFS 100 MB 10 MB 714sZion - 12 Workers 100 MB 10 MB 0510sZion - 24 Workers 100 GB 10 MB 0348s

Pig query - Swift 1 GB 32 MB 156sPig query - HDFS 1 GB 32 MB 754sZion - 12 Workers 1 GB 32 MB 4183sZion - 24 Workers 1 GB 32 MB 2256s

Pig query - Swift 10 GB 200 MB 985sPig query - HDFS 10 GB 200 MB 946sZion - 12 Workers 10 GB 200 MB 26253sZion - 24 Workers 10 GB 200 MB 13392s

the 10GB dataset and 12 workers it took 2X much more time thanwith 24 workers Also this experiment confirms the scalability ofZion without incurring in resource contention in the storage clusterIf we had executed this interactive query with 24 workers in thestorage nodes the storage cluster CPU would had reached 100 inall storage nodes for 134 seconds leading to resource contention

Listing 2 Evaluated query in Hadoop and Swift + Zion clus-terss e l e c t u s e r _ i d count ( lowast ) t o t a l

where ( r e q _ t = GetContentResponse or r e q _ t = PutContentResponse )

and msg= Reques t done group by u s e r _ i d order by t o t a l DESC l imi t 10

7 CONCLUSIONThis paper presents Zion an innovative data-driven serverlesscomputing middleware for object storage Unlike commercial event-driven serverless models our data-driven functions intercept andoperate over objects as they are readwrite fromto the object storeSince Zion is serverless it overcomes the scalability and resourcecontention problems of active storage without the need to managea server or a runtime environment

By injecting computations in the data pipeline Zion is ideal foruse cases that require synchronous interactions with external usersExamples of these use cases include (but are not limited to) dynamiccontent generation interactive queries personalization contentverification and access control In many of them the data localityof our inline computations contributes to optimize latency and toreduce data transfers outside the data center

ACKNOWLEDGMENTSThis work has been partially funded by the European Union Horizon2020 Framework Programme in the context of the project IOStackSoftware-defined Storage for Big Data (H2020-644182) and by theSpanish Ministry of Science and Innovation (Grant TIN2016-77836-C2-1-R)

132

Data-Driven Serverless Functions for Object Storage Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA

REFERENCES
[1] Amazon. 2017. Athena. https://aws.amazon.com/athena. (2017).
[2] Amazon. 2017. Lambda. https://aws.amazon.com/lambda. (2017).
[3] Amazon. 2017. Redshift Spectrum. https://aws.amazon.com/redshift/spectrum. (2017).
[4] Apache. 2012. HBASE Coprocessors. https://blogs.apache.org/hbase/entry/coprocessor_introduction. (2012).
[5] Apache. 2017. Hadoop OpenStack Support: Swift Object Store. https://hadoop.apache.org/docs/stable/hadoop-openstack/index.html. (2017).
[6] Krste Asanovic and D. Patterson. 2014. FireBox: A hardware building block for 2020 warehouse-scale computers. In FAST.
[7] Prosunjit Biswas, Farhan Patwa, and Ravi Sandhu. 2015. Content Level Access Control for OpenStack Swift Storage. In ACM CODASPY. 123–126.
[8] Chao Chen, Yong Chen, and Philip C. Roth. 2012. DOSAS: Mitigating the resource contention in active storage systems. In CLUSTER. 164–172.
[9] Facebook. 2017. Presto: Distributed SQL Engine for Big Data. https://prestodb.io. (2017).
[10] Sadjad Fouladi, Riad S. Wahby, Brennan Shacklett, Karthikeyan Vasuki Balasubramaniam, William Zeng, Rahul Bhalerao, Anirudh Sivaraman, George Porter, and Keith Winstein. 2017. Encoding, Fast and Slow: Low-Latency Video Processing Using Thousands of Tiny Threads. In NSDI.
[11] Peter X. Gao, Akshay Narayan, Sagar Karandikar, Joao Carreira, Sangjin Han, Rachit Agarwal, Sylvia Ratnasamy, and Scott Shenker. 2016. Network requirements for resource disaggregation. In OSDI. 249–264.
[12] Google. 2009. BigTable Coprocessors. https://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf. (2009).
[13] Google. 2017. Cloud Functions. https://cloud.google.com/functions/docs. (2017).
[14] Raúl Gracia-Tinedo, Pedro García-López, Marc Sánchez-Artigas, Josep Sampé, Yosef Moatti, Eran Rom, Dalit Naor, Ramon Nou, Toni Cortés, Pietro Michiardi, and William Oppermann. 2016. IOStack: Software-defined Object Storage. IEEE Internet Computing (2016).
[15] Raúl Gracia-Tinedo, Josep Sampé, Edgar Zamora, Marc Sánchez-Artigas, Pedro García-López, Yosef Moatti, and Eran Rom. 2017. Crystal: Software-Defined Storage for Multi-Tenant Object Stores. In FAST.
[16] Raúl Gracia-Tinedo, Yongchao Tian, Josep Sampé, Hamza Harkous, John Lenton, Pedro García-López, Marc Sánchez-Artigas, and Marko Vukolic. 2015. Dissecting UbuntuOne: Autopsy of a Global-scale Personal Cloud Back-end. In IMC. 155–168.
[17] Scott Hendrickson, Stephen Sturdevant, Tyler Harter, Venkateshwaran Venkataramani, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2016. Serverless computation with OpenLambda. In HotCloud.
[18] HP. 2017. The Machine: Our vision for the Future of Computing. https://www.labs.hpe.com/the-machine. (2017).
[19] IBM. 2017. OpenWhisk. https://developer.ibm.com/openwhisk. (2017).
[20] Eric Jonas, Shivaram Venkataraman, Ion Stoica, and Benjamin Recht. 2017. Occupy the Cloud: Distributed computing for the 99%. arXiv preprint arXiv:1702.04024 (2017).
[21] Microsoft. 2017. Azure Functions. https://azure.microsoft.com/en-us/services/functions. (2017).
[22] Netflix. 2016. Evolution of the Netflix Data Pipeline. http://techblog.netflix.com/2016/02/evolution-of-netflix-data-pipeline.html. (2016).
[23] OpenStack. 2017. Nova. https://docs.openstack.org/nova. (2017).
[24] OpenStack. 2017. Swift. https://docs.openstack.org/swift. (2017).
[25] OpenStack. 2017. Swift Large Objects. https://docs.openstack.org/swift/latest/overview_large_objects.html. (2017).
[26] Juan Piernas, Jarek Nieplocha, and Evan J. Felix. 2007. Evaluation of active storage strategies for the Lustre parallel file system. In SC. 28.
[27] Lingjun Qin and Dan Feng. 2006. Active Storage Framework for Object-based Storage Device. In AINA. 97–101.
[28] Simona Rabinovici-Cohen, Ealan Henis, John Marberg, and Kenneth Nagin. 2014. Storlet engine: performing computations in cloud storage. IBM Technical Report H-0320 (August 2014).
[29] Paul Rad, Van Lindberg, Jeff Prevost, Weining Zhang, and Mo Jamshidi. 2014. ZeroVM: secure distributed processing for big data analytics. In WAC. 1–6.
[30] Redis. 2017. Redis. https://redis.io. (2017).
[31] Erik Riedel, Garth A. Gibson, and Christos Faloutsos. 1998. Active Storage for Large-Scale Data Mining and Multimedia. In VLDB. 62–73.
[32] Michael T. Runde, Wesley G. Stevens, Paul A. Wortman, and John A. Chandy. 2012. An active storage framework for object storage devices. In MSST. 1–12.
[33] Josep Sampé. 2017. Zion framework. https://github.com/JosepSampe/zion. (2017).
[34] Josep Sampé, Pedro García-López, and Marc Sánchez-Artigas. 2016. Vertigo: Programmable micro-controllers for software-defined object storage. In IEEE CLOUD. 180–187.
[35] Seung Woo Son, Samuel Lang, Philip Carns, Robert Ross, Rajeev Thakur, Berkin Ozisikyilmaz, Prabhat Kumar, Wei-Keng Liao, and Alok Choudhary. 2010. Enabling active storage on parallel I/O software stacks. In MSST. 1–12.
[36] Zhipeng Tan, Yanli Yuan, Dan Feng, Tian Zhan, and Xudong Tu. 2011. Implementation of method object in active object-based storage systems. In CCIS. 204–211.
[37] Ubuntu. 2017. Cloud image repository. https://cloud-images.ubuntu.com. (2017).
[38] Ubuntu. 2017. Docker repository. https://hub.docker.com/r/_/ubuntu. (2017).
[39] University of California, Irvine. 1996. Adult Data Set. http://archive.ics.uci.edu/ml/datasets/Adult. (1996).
[40] Himanshu Vashishtha and Eleni Stroulia. 2011. Enhancing query support in HBase via an extended coprocessors framework. In ServiceWave. 75–87.
[41] Rajiv Wickremesinghe, Jeffrey S. Chase, and Jeffrey Scott Vitter. 2002. Distributed computing with load-managed active storage. In HPDC. 13–23.
[42] Yulai Xie, Dan Feng, Yan Li, and Darrell D. E. Long. 2016. Oasis: an active storage framework for object storage platform. Future Generation Computer Systems 56 (2016), 746–758.


Page 8: Data-Driven Serverless Functions for Object Storagefaculty.washington.edu/wlloyd/courses/tcss562... · using AWS Lambda for data-intensive tasks over disaggregated storage [10, 20]

Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA Josep Sampeacute Marc Saacutenchez-Artigas Pedro Garciacutea-Loacutepez Gerard Pariacutes

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

06

12

18

24

3

Respon

seTim

e(secon

ds) 1 Worker

2 Workers

3 Workers

4 Workers

(a) Transaction response time

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

20

40

60

80

100

CP

U U

sage

()

1 Worker

2 Workers

3 Workers

4 Workers

(b) Storage Node CPU Usage

Figure 3 Swift measurementMeasured response times (a) and CPU consumption for a given storage node (b) for getting objectsat a certain rate (transactions per second or TPS) for different values of Swift workers

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

500

1000

1500

2000

2500

3000

Respon

seTim

e(m

s)

80 Intf

60 Intf

40 Intf

20 Intf

0 Intf

(a) 1 Object Server worker

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

500

1000

1500

2000

2500

3000

Respon

seTim

e(m

s)

80 Intf

60 Intf

40 Intf

20 Intf

0 Intf

(b) 2 Object Server workers

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

500

1000

1500

2000

2500

3000

Respon

seTim

e(m

s)

80 Intf

60 Intf

40 Intf

20 Intf

0 Intf

(c) 3 Object Server workers

0 200 400 600 800 1000 1200

Transactions Per Second (TPS)

0

500

1000

1500

2000

2500

3000

Respon

seTim

e(m

s)

80 Intf

60 Intf

40 Intf

20 Intf

0 Intf

(d) 4 Object Server workers

Figure 4 Swift interference measurement These plots show how for a given storage node the swift object storage service behavesfor different number of workers and for different CPU loads of interference (Intf)

4 CPU cores) At the compute side the computation layer is com-posed by 3 nodes Dell PowerEdge R430 with 32GB RAM (each onewith 24 CPU cores) All the rack machines ran Ubuntu Server 1404The client host ran Ubuntu 16041 CloudImage [37]

62 Swift Resource ContentionWe first studied how collocation of compute and data affects SwiftTo do so in ldquoidealrdquo conditions we restricted this experiment to asingle storage node2 For this measurement we used a single proxynode Dell PowerEdge R320 with 12GB RAM and 1 storage nodeDell PowerEdge R320 with 8GB RAM

Base Swift The first experiment consisted of evaluating how Swiftnormally works (Fig 3) For this experiment we stored a bunchof random 10kB files into Swift Next using the unix httperftool we ran distinct workloads each one differing in the numberof transactions per second (TPS) We measured the resultant per-transaction response time (Fig 3a) and CPU usage (Fig 3b) of thestorage node

Technically Swift uses workers to handle a workload In Swift aworker is nothing but a thread that accepts requests Each worker

2 We note that the results are easy to extrapolate to larger Swift deployments

normally accepts 1 024 concurrent requests but it is a config-urable parameter This simple experiment confirmed us that whena worker exhausts the 100 of its corersquos resources the responsetime steeply increases due to the queuing delays For instance with1 worker the corersquos usage reaches 100 around 380 TPS the pointbeyond which the requests start to accumulate as shown in Fig3a This effect can be alleviated by starting new Swift workers asseen in this figure However since storage nodes have 4 cores onlySwift is limited to 4 workers And therefore a storage node willeventually end up experiencing resource contention sooner or latereven if there are no computations collocated with the data

Collocated Computation We repeated the same experiment butcollocating computations with the storage service The results inFig 4 show how the Swiftrsquos processing capacity diminishes as afunction of the CPU usage borrowed from the collocated tasks Forexample if Swift was restricted to use only 1 worker for requestprocessing collocated tasks would have almost no impact on SwiftHowever as the number of Swift workers increase to match thenumber of CPU cores the resource contention begins to producenegative effects in the storage system due to the interference fromthe collocated computations

128

Data-Driven Serverless Functions for Object Storage Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA

Swiftrsquos best practices recommend to use as many workers asCPU cores Under a heavy use of the CPU by the collocated tasksthis figure shows that Swift request processing would be severelydiminished For instance when the collocated tasks consumed 80of the total CPU Swift processing fell to a 35 in this experimentThis result suggests that the right path to go is to move computationsoutside the storage nodes which is what we did with Zion

63 Application SetupNow we describe the specific evaluation setup for our example usecases As object stores can contain a huge diversity of object typesand sizes [16] we set up heterogeneous workloads for all use casesin order to verify the behavior of Zion under different scenarios

Content-level access control For testing this application weused the dataset described in Section 5 but trimmed down to differ-ent object sizes 100kB 1MB 10MB and 100MB As stated beforethe dataset content is filtered out according to the type of userthat requested the object To this aim we used Swift user roles toreturn only specific fields to each user The function associated tothe onAfterGet trigger reads from an implicit parameter the JSONstring containing the allowed fields for a specific userrsquos role (egage education marital-status) and then returns them to theuser

Compression It is well-known that the compression ratio of ob-jects affects resource consumption Objects that contain only zeroswill be compressed more quickly and consume less resources thancompressing a binary object To get rid of this issue we chose touse text documents with a similar compression ratio of around60 The documents were of sizes 10kB 100kB 1MB and 10MBrespectively

Image processing As in the previous case we tried to find themost common scenario for testing this function We focused onthose image sizes commonly used in static web pages Finally weused different jpg files of 90kB 400kB 800kB and 1200 kB andwe set an implicit parameter so that the function resizes the imageto the 50 of its original size

Signature verification A sample usage of this function may be toverify official documents and forms in a public institution For thisexperiment we used text documents also of different sizes 10 kB100 kB 1 MB and 10 MB These documents are signed with a RSAprivate key The experiment operates on PUT requests verifyingthe correction of the signature

Interactive data queries and result aggregation For this usecase we used different sizes of the publicly available UbuntuOnersquoslog file [16] 100MB 1GB and 10GB respectively We comparedZionrsquos functions execution time to those obtained using HadoopWe built a 10-node Hadoop cluster of commodity workstations4-core i5 at 253 GHZ and 16 GB of RAM For this application weissued a Pig query against 1 The log files stored in HDFS and 2Swift using the Apache Hadoop-Swift connector [5] For Zion first-stage filtering functions we picked chunk sizes of 10MB 32MBand 200MB for the 100MB 1GB and 10GB files respectively

Table 1 Function information

Application LOC Function Size

Content-level Access Control asymp 29 27 kBCompression asymp 8 18 kBImage processing asymp 17 23 kBSignature verification asymp 43 29 kBInteractive data query asymp 203 6 kB

08 085 09 095 1

Startup Time (seconds)

0

30

60

90

120

150

Numof

Dokcers

Figure 5 Zion runtime startup time Time in seconds that aDocker with the Zion Java runtime takes to start

64 Application characteristicsTable 1 shows information of our four Zion applications The LOCcolumn gives the number of lines of code required to execute thefunction The third column gives the Function size used for eachapplication From this table it can be seen that our functions arevery lightweight for the proposed applications

65 Results

Docker characteristics The Java runtime and then the functionsare run inside Docker containers in our prototype Consequentlythe first validation to do was to assess the impact of starting ourruntime within Docker containers If the startup time was too largeit would hamper the inline processing capabilities of Zion Ourexperiments however revealed that this is not the case

For this experiment we utilized a pre-compiled Docker imagewith our Java runtime integrated in it The experiment consisted oflaunching 1 000 containers at different rates in the compute clusterand measure the startup time and memory along with the amountof memory for keeping the runtime up and running over time

Fig 5 shows the results of this experiment As depicted startinga new Docker with the Java runtime takes between 085 minus 095seconds Regarding RAM consumption we got that each containerconsumes around 35MB of memory These results show how ourruntime takes 5X times less to start in comparison with the AWSLambdarsquos runtime which proves that Zion is lightweight enoughfor elastic inline processing As Zion utilizes a large pool of alreadystarted containers the start up time is typically negligible in manycases and only amounts to 09 secs if there are no free containersin the pool

129

Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA Josep Sampeacute Marc Saacutenchez-Artigas Pedro Garciacutea-Loacutepez Gerard Pariacutes

001 0015 002 0025 003 0035 004

(a) Measured request times with Swift

001 0015 002 0025 003 0035 004

(b) Measured request times with Swift + Zion

Figure 6 Zion base overhead These boxplots show measuredtimes of requesting 10kB objects in Swift (a) and gettingthe same objects with a no-operation (nop) function troughZion (b)

Zion overhead As another basic experiment we studied the extraoverhead that Zion adds when running a function The overheadis the time needed by the interception middleware to redirect theobject to a compute node take an available container launch afunction inside the Docker and pass the object data through it Forthis experiment we made use of the function listed in Listing 1

This simple function iterates over the data without processingit For measuring the overhead we used a set of 10kB objects Wefirst launched 5 000 plain GET requests to measure the base timeneeded to complete a GET request Then we launched another5 000 GET requests for the same objects but in this case we set upthe function in Listing 1 to respond upon the onAfterGet triggerof the objects

The results are plotted in Fig 6 This figure shows that the Zionrsquosoverhead is 9 ms This time includes the penalty of 5 ms for theaddition of an internal hop to Swift (redirect the data through acompute node) plus 4 ms to determine whether and which functionto run inside a free Docker container in the pool when the requesttouches a compute node

Performance of collocated functions We evaluated here theperformance of our application functions For this evaluation weconducted a stress test for each combination of application objectsize and number of function workers As our storage nodes have 4CPU cores we ran each function on 1 2 3 and 4 function workersrespectively The Docker containers were set up as follows 1 CPUcore and 512MB of RAM per worker The results are depicted inFig 7-10

The first experiments were done collocating the containers inthe storage nodes without using the Zion computation layer Themain objective of this experiment was 1 To show how collocationlimits the number of TPS and 2 To show the performance of Zionrsquosfunctions using 1 CPU core per function worker As a side effectnotice that the latter makes it possible to faithfully generalize our

200 300 400 5000

500

1000

1500

2000

Respon

setime(m

s)

10 kB

1 worker 2 workers 3 workers 4 workers

200 300 400 5000

500

1000

1500

2000

100kB

40 60 80 100

TPS

0

500

1000

1500

Respon

setime(m

s)

1MB

5 10 15 20 25 30

TPS

0

1000

2000

3000

4000

10MB

Figure 7 Compression function performance

20 40 60 80 1000

500

1000

1500

2000

1 MB

5 10 15 20

Transactions per Second (TPS)

0

1000

2000

3000

Respon

setime(m

s)

10 MB

2 4 6 8 10

Transactions per Second (TPS)

0

2000

4000

6000

8000

100 MB

0 100 200 300 400 5000

500

1000

1500

2000

Respon

setime(m

s)

100 kB

1 worker 2 workers 3 workers 4 workers

Figure 8 Content-level access control function perfor-mance

results to any number of cores We empirically verified this for theimage resize use case As the cluster is made up of 6 storage nodesthe function with 1 worker was able to process around 9 images persecond for the 400kB image case and around 54 images per secondin the entire cluster For the rest of applications we utilized onlyone storage node The reason is that to tap into the performance ofthe entire cluster we would have to replicate 6 times each object toensure that a copy of it is available at every storage node In activestorage the total computation power for an object is proportionalto the number of replicas it has

Notice that in Fig 7- 10 the curves representing the responsetime start to increase when the resources of the Docker containerare overloaded ie when all the CPU cores are at the 100 of theircapacity These experiments also show how the object size is veryimportant In all experiments the higher the object size the lowerthe number of TPS a function can handle To wit for the content-level access control application (Fig 8) the function is unable tohandle more than 1 TPS for the 100MB object irrespective of thenumber of workers Concretely this task takes around 16 secondsto complete The same occurs with the image resizing function We

130

Data-Driven Serverless Functions for Object Storage Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA

10 20 30 40 50 600

1000

2000

3000

4000

Respon

setime(m

s)

90 kB

1 worker 2 workers 3 workers 4 workers

10 20 30 400

500

1000

1500

2000

400 kB

5 10 15 20

Transactions per Second (TPS)

0

1000

2000

3000

4000

Respon

setime(m

s)

800 kB

2 4 6 8 10

Transactions per Second (TPS)

0

1

2

3

4times104

1200 kB

Figure 9 Image resizer function performance

100 150 200 250 3000

500

1000

1500

2000

Respon

setime(m

s)

10 kB

1 worker 2 workers 3 workers 4 workers

50 100 150 200 2500

500

1000

1500

2000

100 kB

20 40 60 80 100 120

Transactions per Second (TPS)

0

500

1000

1500

Respon

setime(m

s)

1 MB

5 10 15 20 25 30

Transactions per Second (TPS)

0

500

1000

1500

2000

10 MB

Figure 10 Signature verification function performance

saw that resizing an image of 1200kB takes around 236 secondsleading to lt 1 TPS

A function that needs more than 1 second to process an objectit can only handle in parallel as many objects as functions workersare running These examples show that function collocation at thestorage nodes would be in any case insufficient to absorb a burst oftransactions for more than one second Also they demonstrate thatit is almost impossible to predict resource consumption ahead oftime because resources depend on the object size Consequentlyresource management for collocated functions should be dynamicwhich is hard to achieve in practice instead of a simple resourcemanagement policy such as one single worker per CPU core

Performance of non-collocated functions As resource con-tention is not an issue with non-collocated functions what is keyhere is to verify that Zionrsquos storage-disaggregated functions areindeed scalable

In order to test how the Zionrsquos compute layer provides betterscalability than the storage nodes we extended the prior evaluationto non-collocated functions Our compute nodes have 24 cores eachone This means that is possible to start up to 24 function workers in

Figure 11: Image resizer function scalability. [Max TPS vs. number of workers (1-8) for 90 kB, 400 kB, 800 kB, and 1200 kB images; storage nodes vs. Zion compute layer.]

Figure 12: Signature verification function scalability. [Max TPS vs. number of workers (1-8) for 10 kB, 100 kB, 1 MB, and 10 MB objects; storage nodes vs. Zion compute layer.]

However, to make the comparison with the storage nodes fair, we utilized between 1 and 8 function workers.

As in the previous experiments, we used 1 proxy and 1 storage node, in addition to 1 compute node, which is enough to assess the scalability of Zion. The results are shown in Fig. 11-14. In this case, we recorded the maximum number of transactions per second (Max TPS) that each worker was able to handle, with and without collocation.

First, we can see how, in almost all cases with 1 to 4 function workers, non-collocated functions can handle more transactions than the storage nodes. This is due to the fact that the compute nodes are more powerful than the storage nodes in terms of CPU, so their ingestion capacity is slightly higher. Second, the maximum TPS that a storage node can handle is always bounded above by the number of available cores; that is, spawning more workers has no benefit, because they will contend for the available resources. This is the reason why the curve for collocated functions (gray line) flattens out beyond 4 workers in all use cases.

However, non-collocated functions (black line) can continue to process transactions by spawning more workers to respond to the demand. By a quick inspection of all the figures, the scalability of non-collocated functions is practically linear with the number of workers.
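The contrast between the two curves can be reproduced with a minimal capacity model (an illustrative sketch under the one-core-per-worker assumption; the service time value S is hypothetical):

    # Max TPS as a function of workers, assuming each worker needs one
    # dedicated CPU core and S seconds of CPU time per object.
    def max_tps_collocated(workers, storage_cores=4, s=0.1):
        # Beyond the storage node's core count, extra workers only
        # contend for CPU and add no throughput.
        return min(workers, storage_cores) / s

    def max_tps_disaggregated(workers, s=0.1):
        # Workers on separate compute nodes add capacity linearly
        # until the compute nodes themselves are full.
        return workers / s

    for w in range(1, 9):
        print(w, max_tps_collocated(w), max_tps_disaggregated(w))

The first curve flattens at 4 workers, the storage node's core count, like the gray lines in the figures; the second grows linearly, like the black lines.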

131

Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA Josep Sampeacute Marc Saacutenchez-Artigas Pedro Garciacutea-Loacutepez Gerard Pariacutes

Figure 13: Compression function scalability. [Max TPS vs. number of workers (1-8) for 10 kB, 100 kB, 1 MB, and 10 MB objects; storage nodes vs. Zion compute layer.]

Figure 14: Content-level access control function scalability. [Max TPS vs. number of workers (1-8) for 100 kB, 1 MB, 10 MB, and 100 MB objects; storage nodes vs. Zion compute layer.]

This suggests that disaggregating storage and compute is a practical solution to scale out computation in cloud object stores.

Interactive data queries. Table 2 compares the execution times of the same query (Listing 2) for Hadoop and Zion. The entries of this table were obtained by running each configuration 30 times and averaging the results.

Unsurprisingly, the Hadoop configuration that ingests the data from the remote Swift service is the one that presents the highest mean execution time. This is due to network contention: both clusters were geographically far apart, so communication went through some Fast Ethernet links within our institution's LAN. More concretely, it took around 917 seconds to transfer the entire 10 GB dataset and around 86 seconds to transfer the 1 GB dataset.
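These transfer times are consistent with the nominal Fast Ethernet rate (a rough check, assuming 100 Mb/s, i.e., 12.5 MB/s, and ignoring protocol overhead):

    t(10 GB) = 10,240 MB / 12.5 MB/s = 819 s   (observed: ~917 s)
    t(1 GB)  = 1,024 MB / 12.5 MB/s  = 82 s    (observed: ~86 s)

with the extra time on the larger transfer plausibly due to contention on the shared links.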

The configuration that read the data from HDFS had better times, especially for the large datasets. However, it has the disadvantage of requiring all the data to be pre-loaded into HDFS, which may not be practical in many cases.

For Zion functions, the execution times were comparatively small despite using at most 24 workers. This suggests that better times can easily be attained with more workers, which is clearly seen when comparing Zion with 12 and 24 workers.

Table 2: Interactive data queries execution times

Configuration       File size   Chunk size   Time
Pig query - Swift   100 MB      10 MB        81.6 s
Pig query - HDFS    100 MB      10 MB        71.4 s
Zion - 12 Workers   100 MB      10 MB        0.510 s
Zion - 24 Workers   100 MB      10 MB        0.348 s
Pig query - Swift   1 GB        32 MB        156 s
Pig query - HDFS    1 GB        32 MB        75.4 s
Zion - 12 Workers   1 GB        32 MB        41.83 s
Zion - 24 Workers   1 GB        32 MB        22.56 s
Pig query - Swift   10 GB       200 MB       985 s
Pig query - HDFS    10 GB       200 MB       946 s
Zion - 12 Workers   10 GB       200 MB       262.53 s
Zion - 24 Workers   10 GB       200 MB       133.92 s

For the 10 GB dataset, the query with 12 workers took twice as much time as with 24 workers. This experiment also confirms the scalability of Zion without incurring resource contention in the storage cluster: had we executed this interactive query with 24 workers in the storage nodes, the CPUs of all storage nodes would have reached 100% for 134 seconds, leading to resource contention.

Listing 2: Evaluated query in Hadoop and Swift + Zion clusters

    select user_id, count(*) total
    where (req_t = 'GetContentResponse' or req_t = 'PutContentResponse')
      and msg = 'Request done'
    group by user_id order by total DESC limit 10
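For intuition, the first-stage filtering that each Zion worker applies to its chunk can be sketched in Python as follows (a hypothetical rendering: the actual functions run on Zion's Java runtime, and the tab-separated key=value log format assumed here is illustrative):

    from collections import Counter

    def filter_chunk(lines):
        # Per-chunk stage: count 'Request done' responses per user_id.
        # Assumes each log line carries tab-separated key=value fields.
        counts = Counter()
        for line in lines:
            fields = dict(kv.split('=', 1) for kv in line.split('\t') if '=' in kv)
            if (fields.get('req_t') in ('GetContentResponse', 'PutContentResponse')
                    and fields.get('msg') == 'Request done'):
                counts[fields.get('user_id', 'unknown')] += 1
        return counts

    def aggregate(partials, top=10):
        # Final stage: merge the per-chunk counters and keep the top users.
        total = Counter()
        for partial in partials:
            total.update(partial)
        return total.most_common(top)

Each worker runs the per-chunk stage over one 10-200 MB chunk in parallel, and only the small per-chunk counters travel to the aggregation stage, which is why execution time drops as workers are added.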

7 CONCLUSION
This paper presents Zion, an innovative data-driven serverless computing middleware for object storage. Unlike commercial event-driven serverless models, our data-driven functions intercept and operate over objects as they are read from or written to the object store. Since Zion is serverless, it overcomes the scalability and resource contention problems of active storage without the need to manage a server or a runtime environment.

By injecting computations into the data pipeline, Zion is ideal for use cases that require synchronous interactions with external users. Examples of these use cases include (but are not limited to) dynamic content generation, interactive queries, personalization, content verification, and access control. In many of them, the data locality of our inline computations contributes to optimizing latency and reducing data transfers outside the data center.

ACKNOWLEDGMENTS
This work has been partially funded by the European Union Horizon 2020 Framework Programme under the project IOStack: Software-defined Storage for Big Data (H2020-644182), and by the Spanish Ministry of Science and Innovation (Grant TIN2016-77836-C2-1-R).



REFERENCES
[1] Amazon. 2017. Athena. https://aws.amazon.com/athena.
[2] Amazon. 2017. Lambda. https://aws.amazon.com/lambda.
[3] Amazon. 2017. Redshift Spectrum. https://aws.amazon.com/redshift/spectrum.
[4] Apache. 2012. HBase Coprocessors. https://blogs.apache.org/hbase/entry/coprocessor_introduction.
[5] Apache. 2017. Hadoop OpenStack Support: Swift Object Store. https://hadoop.apache.org/docs/stable/hadoop-openstack/index.html.
[6] Krste Asanovic and D. Patterson. 2014. Firebox: A hardware building block for 2020 warehouse-scale computers. In FAST.
[7] Prosunjit Biswas, Farhan Patwa, and Ravi Sandhu. 2015. Content Level Access Control for OpenStack Swift Storage. In ACM CODASPY. 123-126.
[8] Chao Chen, Yong Chen, and Philip C. Roth. 2012. Dosas: Mitigating the resource contention in active storage systems. In CLUSTER. 164-172.
[9] Facebook. 2017. Presto: Distributed SQL Engine for Big Data. https://prestodb.io.
[10] Sadjad Fouladi, Riad S. Wahby, Brennan Shacklett, Karthikeyan Vasuki Balasubramaniam, William Zeng, Rahul Bhalerao, Anirudh Sivaraman, George Porter, and Keith Winstein. 2017. Encoding, Fast and Slow: Low-Latency Video Processing Using Thousands of Tiny Threads. In NSDI.
[11] Peter X. Gao, Akshay Narayan, Sagar Karandikar, Joao Carreira, Sangjin Han, Rachit Agarwal, Sylvia Ratnasamy, and Scott Shenker. 2016. Network requirements for resource disaggregation. In OSDI. 249-264.
[12] Google. 2009. BigTable Coprocessors. https://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf.
[13] Google. 2017. Cloud Functions. https://cloud.google.com/functions/docs.
[14] Raúl Gracia-Tinedo, Pedro García-López, Marc Sánchez-Artigas, Josep Sampé, Yosef Moatti, Eran Rom, Dalit Naor, Ramon Nou, Toni Cortés, Pietro Michiardi, and William Oppermann. 2016. IOStack: Software-defined Object Storage. IEEE Internet Computing (2016).
[15] Raúl Gracia-Tinedo, Josep Sampé, Edgar Zamora, Marc Sánchez-Artigas, Pedro García-López, Yosef Moatti, and Eran Rom. 2017. Crystal: Software-Defined Storage for Multi-Tenant Object Stores. In FAST.
[16] Raúl Gracia-Tinedo, Yongchao Tian, Josep Sampé, Hamza Harkous, John Lenton, Pedro García-López, Marc Sánchez-Artigas, and Marko Vukolic. 2015. Dissecting UbuntuOne: Autopsy of a Global-scale Personal Cloud Back-end. In IMC. 155-168.
[17] Scott Hendrickson, Stephen Sturdevant, Tyler Harter, Venkateshwaran Venkataramani, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2016. Serverless computation with OpenLambda. In HotCloud.
[18] HP. 2017. The Machine: Our vision for the Future of Computing. https://www.labs.hpe.com/the-machine.
[19] IBM. 2017. OpenWhisk. https://developer.ibm.com/openwhisk.
[20] Eric Jonas, Shivaram Venkataraman, Ion Stoica, and Benjamin Recht. 2017. Occupy the Cloud: Distributed computing for the 99%. arXiv preprint arXiv:1702.04024 (2017).
[21] Microsoft. 2017. Azure Functions. https://azure.microsoft.com/en-us/services/functions.
[22] Netflix. 2016. Evolution of the Netflix Data Pipeline. http://techblog.netflix.com/2016/02/evolution-of-netflix-data-pipeline.html.
[23] OpenStack. 2017. Nova. https://docs.openstack.org/nova.
[24] OpenStack. 2017. Swift. https://docs.openstack.org/swift.
[25] OpenStack. 2017. Swift Large Objects. https://docs.openstack.org/swift/latest/overview_large_objects.html.
[26] Juan Piernas, Jarek Nieplocha, and Evan J. Felix. 2007. Evaluation of active storage strategies for the Lustre parallel file system. In SC. 28.
[27] Lingjun Qin and Dan Feng. 2006. Active Storage Framework for Object-based Storage Device. In AINA. 97-101.
[28] Simona Rabinovici-Cohen, Ealan Henis, John Marberg, and Kenneth Nagin. 2014. Storlet engine: performing computations in cloud storage. IBM Technical Report H-0320 (August 2014).
[29] Paul Rad, Van Lindberg, Jeff Prevost, Weining Zhang, and Mo Jamshidi. 2014. ZeroVM: secure distributed processing for big data analytics. In WAC. 1-6.
[30] Redis. 2017. Redis. https://redis.io.
[31] Erik Riedel, Garth A. Gibson, and Christos Faloutsos. 1998. Active Storage for Large-Scale Data Mining and Multimedia. In VLDB. 62-73.
[32] Michael T. Runde, Wesley G. Stevens, Paul A. Wortman, and John A. Chandy. 2012. An active storage framework for object storage devices. In MSST. 1-12.
[33] Josep Sampé. 2017. Zion framework. https://github.com/JosepSampe/zion.
[34] Josep Sampé, Pedro García-López, and Marc Sánchez-Artigas. 2016. Vertigo: Programmable micro-controllers for software-defined object storage. In IEEE CLOUD. 180-187.
[35] Seung Woo Son, Samuel Lang, Philip Carns, Robert Ross, Rajeev Thakur, Berkin Ozisikyilmaz, Prabhat Kumar, Wei-Keng Liao, and Alok Choudhary. 2010. Enabling active storage on parallel I/O software stacks. In MSST. 1-12.
[36] Zhipeng Tan, Yanli Yuan, Dan Feng, Tian Zhan, and Xudong Tu. 2011. Implementation of method object in active object-based storage systems. In CCIS. 204-211.
[37] Ubuntu. 2017. Cloud image repository. https://cloud-images.ubuntu.com.
[38] Ubuntu. 2017. Docker repository. https://hub.docker.com/r/_/ubuntu.
[39] University of California, Irvine. 1996. Adult Data Set. http://archive.ics.uci.edu/ml/datasets/Adult.
[40] Himanshu Vashishtha and Eleni Stroulia. 2011. Enhancing query support in HBase via an extended coprocessors framework. In ServiceWave. 75-87.
[41] Rajiv Wickremesinghe, Jeffrey S. Chase, and Jeffrey Scott Vitter. 2002. Distributed computing with load-managed active storage. In HPDC. 13-23.
[42] Yulai Xie, Dan Feng, Yan Li, and Darrell D. E. Long. 2016. Oasis: an active storage framework for object storage platform. Future Generation Computer Systems 56 (2016), 746-758.


Page 9: Data-Driven Serverless Functions for Object Storagefaculty.washington.edu/wlloyd/courses/tcss562... · using AWS Lambda for data-intensive tasks over disaggregated storage [10, 20]

Data-Driven Serverless Functions for Object Storage Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA

Swiftrsquos best practices recommend to use as many workers asCPU cores Under a heavy use of the CPU by the collocated tasksthis figure shows that Swift request processing would be severelydiminished For instance when the collocated tasks consumed 80of the total CPU Swift processing fell to a 35 in this experimentThis result suggests that the right path to go is to move computationsoutside the storage nodes which is what we did with Zion

63 Application SetupNow we describe the specific evaluation setup for our example usecases As object stores can contain a huge diversity of object typesand sizes [16] we set up heterogeneous workloads for all use casesin order to verify the behavior of Zion under different scenarios

Content-level access control For testing this application weused the dataset described in Section 5 but trimmed down to differ-ent object sizes 100kB 1MB 10MB and 100MB As stated beforethe dataset content is filtered out according to the type of userthat requested the object To this aim we used Swift user roles toreturn only specific fields to each user The function associated tothe onAfterGet trigger reads from an implicit parameter the JSONstring containing the allowed fields for a specific userrsquos role (egage education marital-status) and then returns them to theuser

Compression It is well-known that the compression ratio of ob-jects affects resource consumption Objects that contain only zeroswill be compressed more quickly and consume less resources thancompressing a binary object To get rid of this issue we chose touse text documents with a similar compression ratio of around60 The documents were of sizes 10kB 100kB 1MB and 10MBrespectively

Image processing As in the previous case we tried to find themost common scenario for testing this function We focused onthose image sizes commonly used in static web pages Finally weused different jpg files of 90kB 400kB 800kB and 1200 kB andwe set an implicit parameter so that the function resizes the imageto the 50 of its original size

Signature verification A sample usage of this function may be toverify official documents and forms in a public institution For thisexperiment we used text documents also of different sizes 10 kB100 kB 1 MB and 10 MB These documents are signed with a RSAprivate key The experiment operates on PUT requests verifyingthe correction of the signature

Interactive data queries and result aggregation For this usecase we used different sizes of the publicly available UbuntuOnersquoslog file [16] 100MB 1GB and 10GB respectively We comparedZionrsquos functions execution time to those obtained using HadoopWe built a 10-node Hadoop cluster of commodity workstations4-core i5 at 253 GHZ and 16 GB of RAM For this application weissued a Pig query against 1 The log files stored in HDFS and 2Swift using the Apache Hadoop-Swift connector [5] For Zion first-stage filtering functions we picked chunk sizes of 10MB 32MBand 200MB for the 100MB 1GB and 10GB files respectively

Table 1 Function information

Application LOC Function Size

Content-level Access Control asymp 29 27 kBCompression asymp 8 18 kBImage processing asymp 17 23 kBSignature verification asymp 43 29 kBInteractive data query asymp 203 6 kB

08 085 09 095 1

Startup Time (seconds)

0

30

60

90

120

150

Numof

Dokcers

Figure 5 Zion runtime startup time Time in seconds that aDocker with the Zion Java runtime takes to start

64 Application characteristicsTable 1 shows information of our four Zion applications The LOCcolumn gives the number of lines of code required to execute thefunction The third column gives the Function size used for eachapplication From this table it can be seen that our functions arevery lightweight for the proposed applications

65 Results

Docker characteristics The Java runtime and then the functionsare run inside Docker containers in our prototype Consequentlythe first validation to do was to assess the impact of starting ourruntime within Docker containers If the startup time was too largeit would hamper the inline processing capabilities of Zion Ourexperiments however revealed that this is not the case

For this experiment we utilized a pre-compiled Docker imagewith our Java runtime integrated in it The experiment consisted oflaunching 1 000 containers at different rates in the compute clusterand measure the startup time and memory along with the amountof memory for keeping the runtime up and running over time

Fig 5 shows the results of this experiment As depicted startinga new Docker with the Java runtime takes between 085 minus 095seconds Regarding RAM consumption we got that each containerconsumes around 35MB of memory These results show how ourruntime takes 5X times less to start in comparison with the AWSLambdarsquos runtime which proves that Zion is lightweight enoughfor elastic inline processing As Zion utilizes a large pool of alreadystarted containers the start up time is typically negligible in manycases and only amounts to 09 secs if there are no free containersin the pool

129

Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA Josep Sampeacute Marc Saacutenchez-Artigas Pedro Garciacutea-Loacutepez Gerard Pariacutes

001 0015 002 0025 003 0035 004

(a) Measured request times with Swift

001 0015 002 0025 003 0035 004

(b) Measured request times with Swift + Zion

Figure 6 Zion base overhead These boxplots show measuredtimes of requesting 10kB objects in Swift (a) and gettingthe same objects with a no-operation (nop) function troughZion (b)

Zion overhead As another basic experiment we studied the extraoverhead that Zion adds when running a function The overheadis the time needed by the interception middleware to redirect theobject to a compute node take an available container launch afunction inside the Docker and pass the object data through it Forthis experiment we made use of the function listed in Listing 1

This simple function iterates over the data without processingit For measuring the overhead we used a set of 10kB objects Wefirst launched 5 000 plain GET requests to measure the base timeneeded to complete a GET request Then we launched another5 000 GET requests for the same objects but in this case we set upthe function in Listing 1 to respond upon the onAfterGet triggerof the objects

The results are plotted in Fig 6 This figure shows that the Zionrsquosoverhead is 9 ms This time includes the penalty of 5 ms for theaddition of an internal hop to Swift (redirect the data through acompute node) plus 4 ms to determine whether and which functionto run inside a free Docker container in the pool when the requesttouches a compute node

Performance of collocated functions We evaluated here theperformance of our application functions For this evaluation weconducted a stress test for each combination of application objectsize and number of function workers As our storage nodes have 4CPU cores we ran each function on 1 2 3 and 4 function workersrespectively The Docker containers were set up as follows 1 CPUcore and 512MB of RAM per worker The results are depicted inFig 7-10

The first experiments were done collocating the containers inthe storage nodes without using the Zion computation layer Themain objective of this experiment was 1 To show how collocationlimits the number of TPS and 2 To show the performance of Zionrsquosfunctions using 1 CPU core per function worker As a side effectnotice that the latter makes it possible to faithfully generalize our

200 300 400 5000

500

1000

1500

2000

Respon

setime(m

s)

10 kB

1 worker 2 workers 3 workers 4 workers

200 300 400 5000

500

1000

1500

2000

100kB

40 60 80 100

TPS

0

500

1000

1500

Respon

setime(m

s)

1MB

5 10 15 20 25 30

TPS

0

1000

2000

3000

4000

10MB

Figure 7 Compression function performance

20 40 60 80 1000

500

1000

1500

2000

1 MB

5 10 15 20

Transactions per Second (TPS)

0

1000

2000

3000

Respon

setime(m

s)

10 MB

2 4 6 8 10

Transactions per Second (TPS)

0

2000

4000

6000

8000

100 MB

0 100 200 300 400 5000

500

1000

1500

2000

Respon

setime(m

s)

100 kB

1 worker 2 workers 3 workers 4 workers

Figure 8 Content-level access control function perfor-mance

results to any number of cores We empirically verified this for theimage resize use case As the cluster is made up of 6 storage nodesthe function with 1 worker was able to process around 9 images persecond for the 400kB image case and around 54 images per secondin the entire cluster For the rest of applications we utilized onlyone storage node The reason is that to tap into the performance ofthe entire cluster we would have to replicate 6 times each object toensure that a copy of it is available at every storage node In activestorage the total computation power for an object is proportionalto the number of replicas it has

Notice that in Fig 7- 10 the curves representing the responsetime start to increase when the resources of the Docker containerare overloaded ie when all the CPU cores are at the 100 of theircapacity These experiments also show how the object size is veryimportant In all experiments the higher the object size the lowerthe number of TPS a function can handle To wit for the content-level access control application (Fig 8) the function is unable tohandle more than 1 TPS for the 100MB object irrespective of thenumber of workers Concretely this task takes around 16 secondsto complete The same occurs with the image resizing function We

130

Data-Driven Serverless Functions for Object Storage Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA

10 20 30 40 50 600

1000

2000

3000

4000

Respon

setime(m

s)

90 kB

1 worker 2 workers 3 workers 4 workers

10 20 30 400

500

1000

1500

2000

400 kB

5 10 15 20

Transactions per Second (TPS)

0

1000

2000

3000

4000

Respon

setime(m

s)

800 kB

2 4 6 8 10

Transactions per Second (TPS)

0

1

2

3

4times104

1200 kB

Figure 9 Image resizer function performance

100 150 200 250 3000

500

1000

1500

2000

Respon

setime(m

s)

10 kB

1 worker 2 workers 3 workers 4 workers

50 100 150 200 2500

500

1000

1500

2000

100 kB

20 40 60 80 100 120

Transactions per Second (TPS)

0

500

1000

1500

Respon

setime(m

s)

1 MB

5 10 15 20 25 30

Transactions per Second (TPS)

0

500

1000

1500

2000

10 MB

Figure 10 Signature verification function performance

saw that resizing an image of 1200kB takes around 236 secondsleading to lt 1 TPS

A function that needs more than 1 second to process an objectit can only handle in parallel as many objects as functions workersare running These examples show that function collocation at thestorage nodes would be in any case insufficient to absorb a burst oftransactions for more than one second Also they demonstrate thatit is almost impossible to predict resource consumption ahead oftime because resources depend on the object size Consequentlyresource management for collocated functions should be dynamicwhich is hard to achieve in practice instead of a simple resourcemanagement policy such as one single worker per CPU core

Performance of non-collocated functions As resource con-tention is not an issue with non-collocated functions what is keyhere is to verify that Zionrsquos storage-disaggregated functions areindeed scalable

In order to test how the Zionrsquos compute layer provides betterscalability than the storage nodes we extended the prior evaluationto non-collocated functions Our compute nodes have 24 cores eachone This means that is possible to start up to 24 function workers in

0 1 2 3 4 5 6 7 810

20

30

40

Max

TPS

90 kB

Storage nodes Zion compute layer

0 1 2 3 4 5 6 7 85

10

15

20

25

400 kB

0 1 2 3 4 5 6 7 8

Workers

0

2

4

6

8

Max

TPS

800 kB

0 1 2 3 4 5 6 7 8

Workers

-1

0

1

2

3

1200 kB

Figure 11 Image resizer function scalability

0 1 2 3 4 5 6 7 8100

150

200

250

300

Max

TPS

10 kB

Storage nodes Zion compute layer

0 1 2 3 4 5 6 7 850

100

150

200

250

100 kB

0 1 2 3 4 5 6 7 8

Workers

0

50

100

150

200

Max

TPS

1 MB

0 1 2 3 4 5 6 7 8

Workers

6

8

10

12

14

10 MB

Figure 12 Signature verification function scalability

each node However to make the comparative fair with the storagenodes we utilized between 1 to 8 function workers

As in the previous experiments we used 1 proxy and 1 storagenode in addition to 1 compute node which is enough to assessthe scalability of Zion The results are shown in Fig 13- 12 In thiscase we recorded the number of maximum transactions per second(Max TPS) that each worker was able to handle with and withoutcollocation

First we can see how in almost all cases with 1 to 4 functionworkers non-collocated functions can handle more transactionsthan the storage nodes This is due to the fact that the computenodes are more powerful than the storage nodes in terms of CPUTherefore the capacity of ingestion is slightly higher Second themaximum TPS that a storage node can handle is always limitedabove by the number of available cores That is spawning moreworkers has no benefit because they will contend for the availableresources This is the reason why the curve for collocated functions(gray line) flattens out beyond 4 workers in all use cases

However non-collocated functions (black line) can continue toprocess transactions by spawning more workers to respond to thedemand By a quick inspection of all the figures the scalability ofnon-collocated functions is practically linear with the number of

131

Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA Josep Sampeacute Marc Saacutenchez-Artigas Pedro Garciacutea-Loacutepez Gerard Pariacutes

0 1 2 3 4 5 6 7 8200

400

600

800

1000

Max

TPS 10 kB

Storage nodes Zion compute layer

0 1 2 3 4 5 6 7 80

200

400

600

800

100 kB

0 1 2 3 4 5 6 7 8

Workers

20

40

60

80

Max

TPS 1 MB

0 1 2 3 4 5 6 7 8

Workers

0

5

10

15

10 MB

Figure 13 Compression function scalability

0 1 2 3 4 5 6 7 80

200

400

600

Max

TPS

100 kB

Storage nodes Zion compute layer

0 1 2 3 4 5 6 7 820

40

60

80

1 MB

0 1 2 3 4 5 6 7 8

Workers

0

5

10

Max

TPS

10 MB

0 1 2 3 4 5 6 7 8

Workers

0

1

2

3

100 MB

Figure 14 Content-level access control function scalability

workers This suggests that disaggregating storage and compute is apractical solution to scale out computation in cloud object stores

Interactive data queries Table 2 compares the execution timesfor the same query for Hadoop and Zion (Listing 2) The entries ofthis table were obtained by running 30X each configuration andthen averaging the results

Non-surprisingly the Hadoop configuration that ingests the datafrom the remote Swift service is the one that presents the highestmean execution time This occurred due to network contentionBoth clusters were geographically far apart so communicationwent through some FAST Ethernet links within our institutionrsquosLAN More concretely it took asymp 917 seconds to transfer the entire10GB dataset and asymp 86 seconds to transfer the 1GB dataset

The configuration that read the data from HDFS had better timesspecially for the large datasets However it has the disadvantagethat it requires pre-loading all the data in HDFS which may be notpractical in many cases

For Zion functions the execution times were comparativelysmall despite using at most 24 workers This suggests that bettertimes can be easily attained with more workers This is clearly seenwhen comparing Zion with 12 and 24 workers respectively For

Table 2 Interactive data queries execution times

Configuration File size Chunk size Time

Pig query - Swift 100 MB 10 MB 816sPig query - HDFS 100 MB 10 MB 714sZion - 12 Workers 100 MB 10 MB 0510sZion - 24 Workers 100 GB 10 MB 0348s

Pig query - Swift 1 GB 32 MB 156sPig query - HDFS 1 GB 32 MB 754sZion - 12 Workers 1 GB 32 MB 4183sZion - 24 Workers 1 GB 32 MB 2256s

Pig query - Swift 10 GB 200 MB 985sPig query - HDFS 10 GB 200 MB 946sZion - 12 Workers 10 GB 200 MB 26253sZion - 24 Workers 10 GB 200 MB 13392s

the 10GB dataset and 12 workers it took 2X much more time thanwith 24 workers Also this experiment confirms the scalability ofZion without incurring in resource contention in the storage clusterIf we had executed this interactive query with 24 workers in thestorage nodes the storage cluster CPU would had reached 100 inall storage nodes for 134 seconds leading to resource contention

Listing 2 Evaluated query in Hadoop and Swift + Zion clus-terss e l e c t u s e r _ i d count ( lowast ) t o t a l

where ( r e q _ t = GetContentResponse or r e q _ t = PutContentResponse )

and msg= Reques t done group by u s e r _ i d order by t o t a l DESC l imi t 10

7 CONCLUSIONThis paper presents Zion an innovative data-driven serverlesscomputing middleware for object storage Unlike commercial event-driven serverless models our data-driven functions intercept andoperate over objects as they are readwrite fromto the object storeSince Zion is serverless it overcomes the scalability and resourcecontention problems of active storage without the need to managea server or a runtime environment

By injecting computations in the data pipeline Zion is ideal foruse cases that require synchronous interactions with external usersExamples of these use cases include (but are not limited to) dynamiccontent generation interactive queries personalization contentverification and access control In many of them the data localityof our inline computations contributes to optimize latency and toreduce data transfers outside the data center

ACKNOWLEDGMENTSThis work has been partially funded by the European Union Horizon2020 Framework Programme in the context of the project IOStackSoftware-defined Storage for Big Data (H2020-644182) and by theSpanish Ministry of Science and Innovation (Grant TIN2016-77836-C2-1-R)

132

Data-Driven Serverless Functions for Object Storage Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA

REFERENCES[1] Amazon 2017 Athena httpsawsamazoncomathena (2017)[2] Amazon 2017 Lambda httpsawsamazoncomlambda (2017)[3] Amazon 2017 Redshift Spectrum httpsawsamazoncomredshiftspectrum

(2017)[4] Apache 2012 HBASE Coprocessors httpsblogsapacheorghbaseentry

coprocessor_introduction (2012)[5] Apache 2017 Hadoop OpenStack Support Swift Object Store httpshadoop

apacheorgdocsstablehadoop-openstackindexhtml (2017)[6] Krste Asanovic and D Patterson 2014 Firebox A hardware building block for

2020 warehouse-scale computers In FAST[7] Prosunjit Biswas Farhan Patwa and Ravi Sandhu 2015 Content Level Access

Control for OpenStack Swift Storage In ACM CODASPY 123ndash126[8] Chao Chen Yong Chen and Philip C Roth 2012 Dosas Mitigating the resource

contention in active storage systems In CLUSTER 164ndash172[9] Facebook 2017 Presto Distributed SQL Engine for Big Data httpsprestodbio

(2017)[10] Sadjad Fouladi Riad S Wahby Brennan Shacklett Karthikeyan Vasuki Balasubra-

maniam William Zeng Rahul Bhalerao Anirudh Sivaraman George Porter andKeith Winstein 2017 Encoding Fast and Slow Low-Latency Video ProcessingUsing Thousands of Tiny Threads In NSDI

[11] Peter X Gao Akshay Narayan Sagar Karandikar Joao Carreira Sangjin HanRachit Agarwal Sylvia Ratnasamy and Scott Shenker 2016 Network require-ments for resource disaggregation In OSDI 249ndash264

[12] Google 2009 BigTable Coprocessors httpswwwcscornelleduprojectsladis2009talksdean-keynote-ladis2009pdf (2009)

[13] Google 2017 Cloud Functions httpscloudgooglecomfunctionsdocs (2017)[14] Rauacutel Gracia-Tinedo Pedro Garciacutea-Loacutepez Marc Saacutenchez-Artigas Josep Sampeacute

Yosef Moatti Eran Rom Dalit Naor Ramon Nou Toni Corteacutes Pietro Michiardiand William Oppermann 2016 IOSTACK Software-defined Object StorageIEEE Internet Computing (2016)

[15] Rauacutel Gracia-Tinedo Josep Sampeacute Edgar Zamora Marc Saacutenchez-Artigas PedroGarciacutea-Loacutepez Yosef Moatti and Eran Rom 2017 Crystal Software-DefinedStorage for Multi-Tenant Object Stores In FAST

[16] Rauacutel Gracia-Tinedo Yongchao Tian Josep Sampeacute Hamza Harkous John LentonPedro Garciacutea-Loacutepez Marc Saacutenchez-Artigas and Marko Vukolic 2015 DissectingUbuntuOne Autopsy of a Global-scale Personal Cloud Back-end In IMC 155ndash168

[17] Scott Hendrickson Stephen Sturdevant Tyler Harter VenkateshwaranVenkataramani Andrea C Arpaci-Dusseau and Remzi H Arpaci-Dusseau 2016Serverless computation with OpenLambda In HotCloud

[18] HP 2017 The Machine Our vision for the Future of Computing httpswwwlabshpecomthe-machine (2017)

[19] IBM 2017 OpenWhisk httpsdeveloperibmcomopenwhisk (2017)[20] Eric Jonas Shivaram Venkataraman Ion Stoica and Benjamin Recht 2017

Occupy the Cloud Distributed computing for the 99 arXiv preprintarXiv170204024 (2017)

[21] Microsoft 2017 Azure Functions httpsazuremicrosoftcomen-usservicesfunctions (2017)

[22] Netflix 2016 Evolution of the Netflix Data Pipeline httptechblognetflixcom201602evolution-of-netflix-data-pipelinehtml (2016)

[23] OpenStack 2017 Nova httpsdocsopenstackorgnova (2017)[24] OpenStack 2017 Swift httpsdocsopenstackorgswift (2017)[25] OpenStack 2017 Swift Large Objects httpsdocsopenstackorgswiftlatest

overview_large_objectshtml (2017)[26] Juan Piernas Jarek Nieplocha and Evan J Felix 2007 Evaluation of active storage

strategies for the lustre parallel file system In SC 28[27] Lingjun Qin and Dan Feng 2006 Active Storage Framework for Object-based

Storage Device In AINA 97ndash101[28] Simona Rabinovici-Cohen Ealan Henis John Marberg and Kenneth Nagin 2014

Storlet engine performing computations in cloud storage Technical Report IBMTechnical Report H-0320 (August 2014)

[29] Paul Rad Van Lindberg Jeff Prevost Weining Zhang and Mo Jamshidi 2014ZeroVM secure distributed processing for big data analytics In WAC 1ndash6

[30] Redis 2017 Redis httpsredisio (2017)[31] Erik Riedel Garth A Gibson and Christos Faloutsos 1998 Active Storage for

Large-Scale Data Mining and Multimedia In VLDB 62ndash73[32] Michael T Runde Wesley G Stevens Paul A Wortman and John A Chandy 2012

An active storage framework for object storage devices In MSST 1ndash12[33] Josep Sampeacute 2017 Zion framework httpsgithubcomJosepSampezion

(2017)[34] Josep Sampeacute Pedro Garciacutea-Loacutepez and Marc Saacutenchez-Artigas 2016 Vertigo

Programmable micro-controllers for software-defined object storage In IEEECLOUD 180ndash187

[35] Seung Woo Son Samuel Lang Philip Carns Robert Ross Rajeev Thakur BerkinOzisikyilmaz Prabhat Kumar Wei-Keng Liao and Alok Choudhary 2010 En-abling active storage on parallel IO software stacks In MSST 1ndash12

[36] Zhipeng Tan Yanli Yuan Dan Feng Tian Zhan and Xudong Tu 2011 Imple-mentation of method object in active object-based storage systems In CCIS204ndash211

[37] Ubuntu 2017 Cloud image repository httpscloud-imagesubuntucom (2017)[38] Ubuntu 2017 Docker repository httpshubdockercomr_ubuntu (2017)[39] Irvine University of California 1996 Adult Data Set httparchiveicsuciedu

mldatasetsAdult (1996)[40] Himanshu Vashishtha and Eleni Stroulia 2011 Enhancing query support in

hbase via an extended coprocessors framework In ServiceWave 75ndash87[41] Rajiv Wickremesinghe Jeffrey S Chase and Jeffrey Scott Vitter 2002 Distributed

computing with load-managed active storage In HPDC 13ndash23[42] Yulai Xie Dan Feng Yan Li and Darrell DE Long 2016 Oasis an active storage

framework for object storage platform Future Generation Computer Systems 56(2016) 746ndash758

133

  • Abstract
  • 1 Introduction
  • 2 Related Work
  • 3 Design Overview
    • 31 Interception Software and Metadata Service
    • 32 Computation Layer
      • 4 Implementation
        • 41 Interception Software and Metadata Service
        • 42 Computation Layer
          • 5 Applications
          • 6 Evaluation
            • 61 System Setup
            • 62 Swift Resource Contention
            • 63 Application Setup
            • 64 Application characteristics
            • 65 Results
              • 7 Conclusion
              • Acknowledgments
              • References
Page 10: Data-Driven Serverless Functions for Object Storagefaculty.washington.edu/wlloyd/courses/tcss562... · using AWS Lambda for data-intensive tasks over disaggregated storage [10, 20]

Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA Josep Sampeacute Marc Saacutenchez-Artigas Pedro Garciacutea-Loacutepez Gerard Pariacutes

001 0015 002 0025 003 0035 004

(a) Measured request times with Swift

001 0015 002 0025 003 0035 004

(b) Measured request times with Swift + Zion

Figure 6 Zion base overhead These boxplots show measuredtimes of requesting 10kB objects in Swift (a) and gettingthe same objects with a no-operation (nop) function troughZion (b)

Zion overhead As another basic experiment we studied the extraoverhead that Zion adds when running a function The overheadis the time needed by the interception middleware to redirect theobject to a compute node take an available container launch afunction inside the Docker and pass the object data through it Forthis experiment we made use of the function listed in Listing 1

This simple function iterates over the data without processingit For measuring the overhead we used a set of 10kB objects Wefirst launched 5 000 plain GET requests to measure the base timeneeded to complete a GET request Then we launched another5 000 GET requests for the same objects but in this case we set upthe function in Listing 1 to respond upon the onAfterGet triggerof the objects

The results are plotted in Fig 6 This figure shows that the Zionrsquosoverhead is 9 ms This time includes the penalty of 5 ms for theaddition of an internal hop to Swift (redirect the data through acompute node) plus 4 ms to determine whether and which functionto run inside a free Docker container in the pool when the requesttouches a compute node

Performance of collocated functions We evaluated here theperformance of our application functions For this evaluation weconducted a stress test for each combination of application objectsize and number of function workers As our storage nodes have 4CPU cores we ran each function on 1 2 3 and 4 function workersrespectively The Docker containers were set up as follows 1 CPUcore and 512MB of RAM per worker The results are depicted inFig 7-10

The first experiments were done collocating the containers inthe storage nodes without using the Zion computation layer Themain objective of this experiment was 1 To show how collocationlimits the number of TPS and 2 To show the performance of Zionrsquosfunctions using 1 CPU core per function worker As a side effectnotice that the latter makes it possible to faithfully generalize our

200 300 400 5000

500

1000

1500

2000

Respon

setime(m

s)

10 kB

1 worker 2 workers 3 workers 4 workers

200 300 400 5000

500

1000

1500

2000

100kB

40 60 80 100

TPS

0

500

1000

1500

Respon

setime(m

s)

1MB

5 10 15 20 25 30

TPS

0

1000

2000

3000

4000

10MB

Figure 7 Compression function performance

20 40 60 80 1000

500

1000

1500

2000

1 MB

5 10 15 20

Transactions per Second (TPS)

0

1000

2000

3000

Respon

setime(m

s)

10 MB

2 4 6 8 10

Transactions per Second (TPS)

0

2000

4000

6000

8000

100 MB

0 100 200 300 400 5000

500

1000

1500

2000

Respon

setime(m

s)

100 kB

1 worker 2 workers 3 workers 4 workers

Figure 8 Content-level access control function perfor-mance

results to any number of cores We empirically verified this for theimage resize use case As the cluster is made up of 6 storage nodesthe function with 1 worker was able to process around 9 images persecond for the 400kB image case and around 54 images per secondin the entire cluster For the rest of applications we utilized onlyone storage node The reason is that to tap into the performance ofthe entire cluster we would have to replicate 6 times each object toensure that a copy of it is available at every storage node In activestorage the total computation power for an object is proportionalto the number of replicas it has

Notice that in Fig 7- 10 the curves representing the responsetime start to increase when the resources of the Docker containerare overloaded ie when all the CPU cores are at the 100 of theircapacity These experiments also show how the object size is veryimportant In all experiments the higher the object size the lowerthe number of TPS a function can handle To wit for the content-level access control application (Fig 8) the function is unable tohandle more than 1 TPS for the 100MB object irrespective of thenumber of workers Concretely this task takes around 16 secondsto complete The same occurs with the image resizing function We

130

Data-Driven Serverless Functions for Object Storage Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA

10 20 30 40 50 600

1000

2000

3000

4000

Respon

setime(m

s)

90 kB

1 worker 2 workers 3 workers 4 workers

10 20 30 400

500

1000

1500

2000

400 kB

5 10 15 20

Transactions per Second (TPS)

0

1000

2000

3000

4000

Respon

setime(m

s)

800 kB

2 4 6 8 10

Transactions per Second (TPS)

0

1

2

3

4times104

1200 kB

Figure 9 Image resizer function performance

100 150 200 250 3000

500

1000

1500

2000

Respon

setime(m

s)

10 kB

1 worker 2 workers 3 workers 4 workers

50 100 150 200 2500

500

1000

1500

2000

100 kB

20 40 60 80 100 120

Transactions per Second (TPS)

0

500

1000

1500

Respon

setime(m

s)

1 MB

5 10 15 20 25 30

Transactions per Second (TPS)

0

500

1000

1500

2000

10 MB

Figure 10 Signature verification function performance

saw that resizing an image of 1200kB takes around 236 secondsleading to lt 1 TPS

A function that needs more than 1 second to process an objectit can only handle in parallel as many objects as functions workersare running These examples show that function collocation at thestorage nodes would be in any case insufficient to absorb a burst oftransactions for more than one second Also they demonstrate thatit is almost impossible to predict resource consumption ahead oftime because resources depend on the object size Consequentlyresource management for collocated functions should be dynamicwhich is hard to achieve in practice instead of a simple resourcemanagement policy such as one single worker per CPU core

Performance of non-collocated functions As resource con-tention is not an issue with non-collocated functions what is keyhere is to verify that Zionrsquos storage-disaggregated functions areindeed scalable

In order to test how the Zionrsquos compute layer provides betterscalability than the storage nodes we extended the prior evaluationto non-collocated functions Our compute nodes have 24 cores eachone This means that is possible to start up to 24 function workers in

0 1 2 3 4 5 6 7 810

20

30

40

Max

TPS

90 kB

Storage nodes Zion compute layer

0 1 2 3 4 5 6 7 85

10

15

20

25

400 kB

0 1 2 3 4 5 6 7 8

Workers

0

2

4

6

8

Max

TPS

800 kB

0 1 2 3 4 5 6 7 8

Workers

-1

0

1

2

3

1200 kB

Figure 11 Image resizer function scalability

0 1 2 3 4 5 6 7 8100

150

200

250

300

Max

TPS

10 kB

Storage nodes Zion compute layer

0 1 2 3 4 5 6 7 850

100

150

200

250

100 kB

0 1 2 3 4 5 6 7 8

Workers

0

50

100

150

200

Max

TPS

1 MB

0 1 2 3 4 5 6 7 8

Workers

6

8

10

12

14

10 MB

Figure 12 Signature verification function scalability

each node However to make the comparative fair with the storagenodes we utilized between 1 to 8 function workers

As in the previous experiments we used 1 proxy and 1 storagenode in addition to 1 compute node which is enough to assessthe scalability of Zion The results are shown in Fig 13- 12 In thiscase we recorded the number of maximum transactions per second(Max TPS) that each worker was able to handle with and withoutcollocation

First we can see how in almost all cases with 1 to 4 functionworkers non-collocated functions can handle more transactionsthan the storage nodes This is due to the fact that the computenodes are more powerful than the storage nodes in terms of CPUTherefore the capacity of ingestion is slightly higher Second themaximum TPS that a storage node can handle is always limitedabove by the number of available cores That is spawning moreworkers has no benefit because they will contend for the availableresources This is the reason why the curve for collocated functions(gray line) flattens out beyond 4 workers in all use cases

However non-collocated functions (black line) can continue toprocess transactions by spawning more workers to respond to thedemand By a quick inspection of all the figures the scalability ofnon-collocated functions is practically linear with the number of

131

Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA Josep Sampeacute Marc Saacutenchez-Artigas Pedro Garciacutea-Loacutepez Gerard Pariacutes

0 1 2 3 4 5 6 7 8200

400

600

800

1000

Max

TPS 10 kB

Storage nodes Zion compute layer

0 1 2 3 4 5 6 7 80

200

400

600

800

100 kB

0 1 2 3 4 5 6 7 8

Workers

20

40

60

80

Max

TPS 1 MB

0 1 2 3 4 5 6 7 8

Workers

0

5

10

15

10 MB

Figure 13 Compression function scalability

0 1 2 3 4 5 6 7 80

200

400

600

Max

TPS

100 kB

Storage nodes Zion compute layer

0 1 2 3 4 5 6 7 820

40

60

80

1 MB

0 1 2 3 4 5 6 7 8

Workers

0

5

10

Max

TPS

10 MB

0 1 2 3 4 5 6 7 8

Workers

0

1

2

3

100 MB

Figure 14 Content-level access control function scalability

workers This suggests that disaggregating storage and compute is apractical solution to scale out computation in cloud object stores

Interactive data queries Table 2 compares the execution timesfor the same query for Hadoop and Zion (Listing 2) The entries ofthis table were obtained by running 30X each configuration andthen averaging the results

Non-surprisingly the Hadoop configuration that ingests the datafrom the remote Swift service is the one that presents the highestmean execution time This occurred due to network contentionBoth clusters were geographically far apart so communicationwent through some FAST Ethernet links within our institutionrsquosLAN More concretely it took asymp 917 seconds to transfer the entire10GB dataset and asymp 86 seconds to transfer the 1GB dataset

The configuration that read the data from HDFS had better timesspecially for the large datasets However it has the disadvantagethat it requires pre-loading all the data in HDFS which may be notpractical in many cases

For Zion functions the execution times were comparativelysmall despite using at most 24 workers This suggests that bettertimes can be easily attained with more workers This is clearly seenwhen comparing Zion with 12 and 24 workers respectively For

Table 2 Interactive data queries execution times

Configuration File size Chunk size Time

Pig query - Swift 100 MB 10 MB 816sPig query - HDFS 100 MB 10 MB 714sZion - 12 Workers 100 MB 10 MB 0510sZion - 24 Workers 100 GB 10 MB 0348s

Pig query - Swift 1 GB 32 MB 156sPig query - HDFS 1 GB 32 MB 754sZion - 12 Workers 1 GB 32 MB 4183sZion - 24 Workers 1 GB 32 MB 2256s

Pig query - Swift 10 GB 200 MB 985sPig query - HDFS 10 GB 200 MB 946sZion - 12 Workers 10 GB 200 MB 26253sZion - 24 Workers 10 GB 200 MB 13392s

the 10GB dataset and 12 workers it took 2X much more time thanwith 24 workers Also this experiment confirms the scalability ofZion without incurring in resource contention in the storage clusterIf we had executed this interactive query with 24 workers in thestorage nodes the storage cluster CPU would had reached 100 inall storage nodes for 134 seconds leading to resource contention

Listing 2 Evaluated query in Hadoop and Swift + Zion clus-terss e l e c t u s e r _ i d count ( lowast ) t o t a l

where ( r e q _ t = GetContentResponse or r e q _ t = PutContentResponse )

and msg= Reques t done group by u s e r _ i d order by t o t a l DESC l imi t 10

7 CONCLUSIONThis paper presents Zion an innovative data-driven serverlesscomputing middleware for object storage Unlike commercial event-driven serverless models our data-driven functions intercept andoperate over objects as they are readwrite fromto the object storeSince Zion is serverless it overcomes the scalability and resourcecontention problems of active storage without the need to managea server or a runtime environment

By injecting computations in the data pipeline Zion is ideal foruse cases that require synchronous interactions with external usersExamples of these use cases include (but are not limited to) dynamiccontent generation interactive queries personalization contentverification and access control In many of them the data localityof our inline computations contributes to optimize latency and toreduce data transfers outside the data center

ACKNOWLEDGMENTSThis work has been partially funded by the European Union Horizon2020 Framework Programme in the context of the project IOStackSoftware-defined Storage for Big Data (H2020-644182) and by theSpanish Ministry of Science and Innovation (Grant TIN2016-77836-C2-1-R)

132

Data-Driven Serverless Functions for Object Storage Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA

REFERENCES[1] Amazon 2017 Athena httpsawsamazoncomathena (2017)[2] Amazon 2017 Lambda httpsawsamazoncomlambda (2017)[3] Amazon 2017 Redshift Spectrum httpsawsamazoncomredshiftspectrum

(2017)[4] Apache 2012 HBASE Coprocessors httpsblogsapacheorghbaseentry

coprocessor_introduction (2012)[5] Apache 2017 Hadoop OpenStack Support Swift Object Store httpshadoop

apacheorgdocsstablehadoop-openstackindexhtml (2017)[6] Krste Asanovic and D Patterson 2014 Firebox A hardware building block for

2020 warehouse-scale computers In FAST[7] Prosunjit Biswas Farhan Patwa and Ravi Sandhu 2015 Content Level Access

Control for OpenStack Swift Storage In ACM CODASPY 123ndash126[8] Chao Chen Yong Chen and Philip C Roth 2012 Dosas Mitigating the resource

contention in active storage systems In CLUSTER 164ndash172[9] Facebook 2017 Presto Distributed SQL Engine for Big Data httpsprestodbio

(2017)[10] Sadjad Fouladi Riad S Wahby Brennan Shacklett Karthikeyan Vasuki Balasubra-

maniam William Zeng Rahul Bhalerao Anirudh Sivaraman George Porter andKeith Winstein 2017 Encoding Fast and Slow Low-Latency Video ProcessingUsing Thousands of Tiny Threads In NSDI

[11] Peter X Gao Akshay Narayan Sagar Karandikar Joao Carreira Sangjin HanRachit Agarwal Sylvia Ratnasamy and Scott Shenker 2016 Network require-ments for resource disaggregation In OSDI 249ndash264

[12] Google 2009 BigTable Coprocessors httpswwwcscornelleduprojectsladis2009talksdean-keynote-ladis2009pdf (2009)

[13] Google 2017 Cloud Functions httpscloudgooglecomfunctionsdocs (2017)[14] Rauacutel Gracia-Tinedo Pedro Garciacutea-Loacutepez Marc Saacutenchez-Artigas Josep Sampeacute

Yosef Moatti Eran Rom Dalit Naor Ramon Nou Toni Corteacutes Pietro Michiardiand William Oppermann 2016 IOSTACK Software-defined Object StorageIEEE Internet Computing (2016)

[15] Rauacutel Gracia-Tinedo Josep Sampeacute Edgar Zamora Marc Saacutenchez-Artigas PedroGarciacutea-Loacutepez Yosef Moatti and Eran Rom 2017 Crystal Software-DefinedStorage for Multi-Tenant Object Stores In FAST

[16] Rauacutel Gracia-Tinedo Yongchao Tian Josep Sampeacute Hamza Harkous John LentonPedro Garciacutea-Loacutepez Marc Saacutenchez-Artigas and Marko Vukolic 2015 DissectingUbuntuOne Autopsy of a Global-scale Personal Cloud Back-end In IMC 155ndash168

[17] Scott Hendrickson Stephen Sturdevant Tyler Harter VenkateshwaranVenkataramani Andrea C Arpaci-Dusseau and Remzi H Arpaci-Dusseau 2016Serverless computation with OpenLambda In HotCloud

[18] HP 2017 The Machine Our vision for the Future of Computing httpswwwlabshpecomthe-machine (2017)

[19] IBM 2017 OpenWhisk httpsdeveloperibmcomopenwhisk (2017)[20] Eric Jonas Shivaram Venkataraman Ion Stoica and Benjamin Recht 2017

Occupy the Cloud Distributed computing for the 99 arXiv preprintarXiv170204024 (2017)

[21] Microsoft 2017 Azure Functions httpsazuremicrosoftcomen-usservicesfunctions (2017)

[22] Netflix 2016 Evolution of the Netflix Data Pipeline httptechblognetflixcom201602evolution-of-netflix-data-pipelinehtml (2016)

[23] OpenStack 2017 Nova httpsdocsopenstackorgnova (2017)[24] OpenStack 2017 Swift httpsdocsopenstackorgswift (2017)[25] OpenStack 2017 Swift Large Objects httpsdocsopenstackorgswiftlatest

overview_large_objectshtml (2017)[26] Juan Piernas Jarek Nieplocha and Evan J Felix 2007 Evaluation of active storage

strategies for the lustre parallel file system In SC 28[27] Lingjun Qin and Dan Feng 2006 Active Storage Framework for Object-based

Storage Device In AINA 97ndash101[28] Simona Rabinovici-Cohen Ealan Henis John Marberg and Kenneth Nagin 2014

Storlet engine performing computations in cloud storage Technical Report IBMTechnical Report H-0320 (August 2014)

[29] Paul Rad Van Lindberg Jeff Prevost Weining Zhang and Mo Jamshidi 2014ZeroVM secure distributed processing for big data analytics In WAC 1ndash6

[30] Redis 2017 Redis httpsredisio (2017)[31] Erik Riedel Garth A Gibson and Christos Faloutsos 1998 Active Storage for

Large-Scale Data Mining and Multimedia In VLDB 62ndash73[32] Michael T Runde Wesley G Stevens Paul A Wortman and John A Chandy 2012

An active storage framework for object storage devices In MSST 1ndash12[33] Josep Sampeacute 2017 Zion framework httpsgithubcomJosepSampezion

(2017)[34] Josep Sampeacute Pedro Garciacutea-Loacutepez and Marc Saacutenchez-Artigas 2016 Vertigo

Programmable micro-controllers for software-defined object storage In IEEECLOUD 180ndash187

[35] Seung Woo Son Samuel Lang Philip Carns Robert Ross Rajeev Thakur BerkinOzisikyilmaz Prabhat Kumar Wei-Keng Liao and Alok Choudhary 2010 En-abling active storage on parallel IO software stacks In MSST 1ndash12

[36] Zhipeng Tan Yanli Yuan Dan Feng Tian Zhan and Xudong Tu 2011 Imple-mentation of method object in active object-based storage systems In CCIS204ndash211

[37] Ubuntu 2017 Cloud image repository httpscloud-imagesubuntucom (2017)[38] Ubuntu 2017 Docker repository httpshubdockercomr_ubuntu (2017)[39] Irvine University of California 1996 Adult Data Set httparchiveicsuciedu

mldatasetsAdult (1996)[40] Himanshu Vashishtha and Eleni Stroulia 2011 Enhancing query support in

hbase via an extended coprocessors framework In ServiceWave 75ndash87[41] Rajiv Wickremesinghe Jeffrey S Chase and Jeffrey Scott Vitter 2002 Distributed

computing with load-managed active storage In HPDC 13ndash23[42] Yulai Xie Dan Feng Yan Li and Darrell DE Long 2016 Oasis an active storage

framework for object storage platform Future Generation Computer Systems 56(2016) 746ndash758

133

  • Abstract
  • 1 Introduction
  • 2 Related Work
  • 3 Design Overview
    • 31 Interception Software and Metadata Service
    • 32 Computation Layer
      • 4 Implementation
        • 41 Interception Software and Metadata Service
        • 42 Computation Layer
          • 5 Applications
          • 6 Evaluation
            • 61 System Setup
            • 62 Swift Resource Contention
            • 63 Application Setup
            • 64 Application characteristics
            • 65 Results
              • 7 Conclusion
              • Acknowledgments
              • References
Page 11: Data-Driven Serverless Functions for Object Storagefaculty.washington.edu/wlloyd/courses/tcss562... · using AWS Lambda for data-intensive tasks over disaggregated storage [10, 20]

Data-Driven Serverless Functions for Object Storage Middleware rsquo17 December 11ndash15 2017 Las Vegas NV USA

10 20 30 40 50 600

1000

2000

3000

4000

Respon

setime(m

s)

90 kB

1 worker 2 workers 3 workers 4 workers

10 20 30 400

500

1000

1500

2000

400 kB

5 10 15 20

Transactions per Second (TPS)

0

1000

2000

3000

4000

Respon

setime(m

s)

800 kB

2 4 6 8 10

Transactions per Second (TPS)

0

1

2

3

4times104

1200 kB

Figure 9 Image resizer function performance

[Figure 10: Signature verification function performance. Panels for 10 kB, 100 kB, 1 MB, and 10 MB objects; axes: response time (ms) vs. transactions per second (TPS); curves for 1, 2, 3, and 4 workers.]

saw that resizing an image of 1200 kB takes around 2.36 seconds, leading to < 1 TPS.

A function that needs more than 1 second to process an object can only handle in parallel as many objects as function workers are running. These examples show that function collocation at the storage nodes would in any case be insufficient to absorb a burst of transactions lasting more than one second. They also demonstrate that it is almost impossible to predict resource consumption ahead of time, because resource usage depends on the object size. Consequently, resource management for collocated functions should be dynamic, which is hard to achieve in practice, instead of a simple resource management policy such as one single worker per CPU core.
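To make this bound concrete, the following is a minimal back-of-the-envelope sketch in Python. The 2.36-second service time comes from the resizer experiment above; the 4-core figure is our assumption, suggested by the flattening of the collocated curves at 4 workers.

    # Best-case throughput of synchronous workers that each process one
    # object at a time: TPS is bounded by workers / per-object service time.
    def max_tps(workers, service_time_s):
        return workers / service_time_s

    print(max_tps(1, 2.36))  # ~0.42 TPS: one worker resizing 1200 kB images
    print(max_tps(4, 2.36))  # ~1.69 TPS: even one worker per core (assuming
                             # a 4-core storage node) cannot absorb a burst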

Performance of non-collocated functions. As resource contention is not an issue with non-collocated functions, what is key here is to verify that Zion's storage-disaggregated functions are indeed scalable.

In order to test how Zion's compute layer provides better scalability than the storage nodes, we extended the prior evaluation to non-collocated functions. Our compute nodes have 24 cores each, so it is possible to start up to 24 function workers in each node.

[Figure 11: Image resizer function scalability. Panels for 90 kB, 400 kB, 800 kB, and 1200 kB objects; axes: Max TPS vs. number of workers (1 to 8); curves for storage nodes and the Zion compute layer.]

[Figure 12: Signature verification function scalability. Panels for 10 kB, 100 kB, 1 MB, and 10 MB objects; axes: Max TPS vs. number of workers (1 to 8); curves for storage nodes and the Zion compute layer.]

However, to make the comparison fair with the storage nodes, we utilized between 1 and 8 function workers.

As in the previous experiments, we used 1 proxy and 1 storage node, in addition to 1 compute node, which is enough to assess the scalability of Zion. The results are shown in Figures 11 to 14. In this case, we recorded the maximum number of transactions per second (Max TPS) that each worker was able to handle with and without collocation.

First, we can see how, in almost all cases with 1 to 4 function workers, non-collocated functions can handle more transactions than the storage nodes. This is due to the fact that the compute nodes are more powerful than the storage nodes in terms of CPU, so their ingestion capacity is slightly higher. Second, the maximum TPS that a storage node can handle is always bounded above by the number of available cores; that is, spawning more workers brings no benefit, because they contend for the available resources. This is the reason why the curve for collocated functions (gray line) flattens out beyond 4 workers in all use cases.

However, non-collocated functions (black line) can continue to process transactions by spawning more workers to respond to the demand. By a quick inspection of all the figures, the scalability of non-collocated functions is practically linear with the number of workers.
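The contrast between the two curves can be summarized with a toy model, assuming each worker contributes a fixed throughput until the cores of the hosting node saturate; the per-worker rate and core count below are hypothetical values for illustration, not measurements.

    # Toy scaling model. Collocated workers contend for the storage node's
    # cores, so throughput flattens once workers > cores; Zion workers run
    # on dedicated compute nodes and keep scaling linearly.
    def collocated_tps(workers, cores, tps_per_worker):
        return min(workers, cores) * tps_per_worker

    def zion_tps(workers, tps_per_worker):
        return workers * tps_per_worker

    for w in range(1, 9):  # 1 to 8 workers, as in the experiments
        print(w, collocated_tps(w, cores=4, tps_per_worker=2.0),
              zion_tps(w, tps_per_worker=2.0))
    # The collocated column flattens at 8.0 beyond 4 workers, while the
    # Zion column keeps growing linearly with the number of workers.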


[Figure 13: Compression function scalability. Panels for 10 kB, 100 kB, 1 MB, and 10 MB objects; axes: Max TPS vs. number of workers (1 to 8); curves for storage nodes and the Zion compute layer.]

[Figure 14: Content-level access control function scalability. Panels for 100 kB, 1 MB, 10 MB, and 100 MB objects; axes: Max TPS vs. number of workers (1 to 8); curves for storage nodes and the Zion compute layer.]

This suggests that disaggregating storage and compute is a practical solution to scale out computation in cloud object stores.

Interactive data queries. Table 2 compares the execution times of the same query (Listing 2) for Hadoop and Zion. The entries of this table were obtained by running each configuration 30 times and averaging the results.

Unsurprisingly, the Hadoop configuration that ingests the data from the remote Swift service is the one that presents the highest mean execution time. This occurred due to network contention: both clusters were geographically far apart, so communication went through some Fast Ethernet links within our institution's LAN. More concretely, it took ≈917 seconds to transfer the entire 10 GB dataset and ≈86 seconds to transfer the 1 GB dataset (consistent with Fast Ethernet's nominal ≈12.5 MB/s, at which 10 GB alone takes ≈800 seconds).

The configuration that read the data from HDFS had better times, especially for the large datasets. However, it has the disadvantage of requiring all the data to be pre-loaded into HDFS, which may not be practical in many cases.

For Zion functions, the execution times were comparatively small, despite using at most 24 workers. This suggests that better times can easily be attained with more workers, which is clearly seen when comparing Zion with 12 and 24 workers, respectively.

Table 2: Interactive data queries execution times

    Configuration        File size   Chunk size   Time
    Pig query - Swift    100 MB      10 MB        81.6 s
    Pig query - HDFS     100 MB      10 MB        71.4 s
    Zion - 12 Workers    100 MB      10 MB        0.510 s
    Zion - 24 Workers    100 MB      10 MB        0.348 s
    Pig query - Swift    1 GB        32 MB        156 s
    Pig query - HDFS     1 GB        32 MB        75.4 s
    Zion - 12 Workers    1 GB        32 MB        41.83 s
    Zion - 24 Workers    1 GB        32 MB        22.56 s
    Pig query - Swift    10 GB       200 MB       985 s
    Pig query - HDFS     10 GB       200 MB       946 s
    Zion - 12 Workers    10 GB       200 MB       262.53 s
    Zion - 24 Workers    10 GB       200 MB       133.92 s

For the 10 GB dataset, the query with 12 workers took 2X more time than with 24 workers. This experiment also confirms the scalability of Zion without incurring resource contention in the storage cluster: had we executed this interactive query with 24 workers in the storage nodes, the CPUs of all storage nodes would have reached 100% for 134 seconds, leading to resource contention.

Listing 2: Evaluated query in Hadoop and Swift + Zion clusters

    select user_id, count(*) total
    where (req_t = 'GetContentResponse' or req_t = 'PutContentResponse')
      and msg = 'Request done'
    group by user_id order by total DESC limit 10
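As an illustration of how this query decomposes over Zion workers, the sketch below counts matching records per user in one chunk and then merges the partial counts across chunks. The helper names, and the assumption that log records are CSV lines with user_id, req_t, and msg fields, are ours for illustration; this is not the actual Zion API or trace format.

    import csv
    import heapq
    from collections import Counter

    # Hypothetical per-chunk function: each worker runs this over one
    # chunk of the log object as it streams from Swift.
    def count_users(chunk_lines):
        partial = Counter()
        reader = csv.DictReader(chunk_lines,
                                fieldnames=('user_id', 'req_t', 'msg'))
        for row in reader:
            if (row['req_t'] in ('GetContentResponse', 'PutContentResponse')
                    and row['msg'] == 'Request done'):
                partial[row['user_id']] += 1
        return partial

    # Merge step: sum the partial counts and keep the top 10 users,
    # mirroring "group by user_id order by total DESC limit 10".
    def top_users(partials, k=10):
        total = Counter()
        for partial in partials:
            total.update(partial)
        return heapq.nlargest(k, total.items(), key=lambda kv: kv[1])

With 12 or 24 workers, the count_users step runs in parallel over the 10 MB to 200 MB chunks of Table 2, and only the small partial counters travel to the merge step.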

7 CONCLUSION

This paper presents Zion, an innovative data-driven serverless computing middleware for object storage. Unlike commercial event-driven serverless models, our data-driven functions intercept and operate over objects as they are read/written from/to the object store. Since Zion is serverless, it overcomes the scalability and resource contention problems of active storage without the need to manage a server or a runtime environment.

By injecting computations in the data pipeline, Zion is ideal for use cases that require synchronous interactions with external users. Examples of these use cases include (but are not limited to) dynamic content generation, interactive queries, personalization, content verification, and access control. In many of them, the data locality of our inline computations contributes to optimizing latency and to reducing data transfers outside the data center.

ACKNOWLEDGMENTS

This work has been partially funded by the European Union Horizon 2020 Framework Programme in the context of the project IOStack: Software-defined Storage for Big Data (H2020-644182) and by the Spanish Ministry of Science and Innovation (Grant TIN2016-77836-C2-1-R).


REFERENCES

[1] Amazon. 2017. Athena. https://aws.amazon.com/athena (2017).
[2] Amazon. 2017. Lambda. https://aws.amazon.com/lambda (2017).
[3] Amazon. 2017. Redshift Spectrum. https://aws.amazon.com/redshift/spectrum (2017).
[4] Apache. 2012. HBASE Coprocessors. https://blogs.apache.org/hbase/entry/coprocessor_introduction (2012).
[5] Apache. 2017. Hadoop OpenStack Support: Swift Object Store. https://hadoop.apache.org/docs/stable/hadoop-openstack/index.html (2017).
[6] Krste Asanovic and D. Patterson. 2014. FireBox: A hardware building block for 2020 warehouse-scale computers. In FAST.
[7] Prosunjit Biswas, Farhan Patwa, and Ravi Sandhu. 2015. Content Level Access Control for OpenStack Swift Storage. In ACM CODASPY, 123–126.
[8] Chao Chen, Yong Chen, and Philip C. Roth. 2012. DOSAS: Mitigating the resource contention in active storage systems. In CLUSTER, 164–172.
[9] Facebook. 2017. Presto: Distributed SQL Engine for Big Data. https://prestodb.io (2017).
[10] Sadjad Fouladi, Riad S. Wahby, Brennan Shacklett, Karthikeyan Vasuki Balasubramaniam, William Zeng, Rahul Bhalerao, Anirudh Sivaraman, George Porter, and Keith Winstein. 2017. Encoding, Fast and Slow: Low-Latency Video Processing Using Thousands of Tiny Threads. In NSDI.
[11] Peter X. Gao, Akshay Narayan, Sagar Karandikar, Joao Carreira, Sangjin Han, Rachit Agarwal, Sylvia Ratnasamy, and Scott Shenker. 2016. Network requirements for resource disaggregation. In OSDI, 249–264.
[12] Google. 2009. BigTable Coprocessors. https://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf (2009).
[13] Google. 2017. Cloud Functions. https://cloud.google.com/functions/docs (2017).
[14] Raúl Gracia-Tinedo, Pedro García-López, Marc Sánchez-Artigas, Josep Sampé, Yosef Moatti, Eran Rom, Dalit Naor, Ramon Nou, Toni Cortés, Pietro Michiardi, and William Oppermann. 2016. IOStack: Software-defined Object Storage. IEEE Internet Computing (2016).
[15] Raúl Gracia-Tinedo, Josep Sampé, Edgar Zamora, Marc Sánchez-Artigas, Pedro García-López, Yosef Moatti, and Eran Rom. 2017. Crystal: Software-Defined Storage for Multi-Tenant Object Stores. In FAST.
[16] Raúl Gracia-Tinedo, Yongchao Tian, Josep Sampé, Hamza Harkous, John Lenton, Pedro García-López, Marc Sánchez-Artigas, and Marko Vukolic. 2015. Dissecting UbuntuOne: Autopsy of a Global-scale Personal Cloud Back-end. In IMC, 155–168.
[17] Scott Hendrickson, Stephen Sturdevant, Tyler Harter, Venkateshwaran Venkataramani, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2016. Serverless computation with OpenLambda. In HotCloud.
[18] HP. 2017. The Machine: Our vision for the Future of Computing. https://www.labs.hpe.com/the-machine (2017).
[19] IBM. 2017. OpenWhisk. https://developer.ibm.com/openwhisk (2017).
[20] Eric Jonas, Shivaram Venkataraman, Ion Stoica, and Benjamin Recht. 2017. Occupy the Cloud: Distributed computing for the 99%. arXiv preprint arXiv:1702.04024 (2017).
[21] Microsoft. 2017. Azure Functions. https://azure.microsoft.com/en-us/services/functions (2017).
[22] Netflix. 2016. Evolution of the Netflix Data Pipeline. http://techblog.netflix.com/2016/02/evolution-of-netflix-data-pipeline.html (2016).
[23] OpenStack. 2017. Nova. https://docs.openstack.org/nova (2017).
[24] OpenStack. 2017. Swift. https://docs.openstack.org/swift (2017).
[25] OpenStack. 2017. Swift Large Objects. https://docs.openstack.org/swift/latest/overview_large_objects.html (2017).
[26] Juan Piernas, Jarek Nieplocha, and Evan J. Felix. 2007. Evaluation of active storage strategies for the Lustre parallel file system. In SC, 28.
[27] Lingjun Qin and Dan Feng. 2006. Active Storage Framework for Object-based Storage Device. In AINA, 97–101.
[28] Simona Rabinovici-Cohen, Ealan Henis, John Marberg, and Kenneth Nagin. 2014. Storlet engine: performing computations in cloud storage. Technical Report H-0320, IBM (August 2014).
[29] Paul Rad, Van Lindberg, Jeff Prevost, Weining Zhang, and Mo Jamshidi. 2014. ZeroVM: secure distributed processing for big data analytics. In WAC, 1–6.
[30] Redis. 2017. Redis. https://redis.io (2017).
[31] Erik Riedel, Garth A. Gibson, and Christos Faloutsos. 1998. Active Storage for Large-Scale Data Mining and Multimedia. In VLDB, 62–73.
[32] Michael T. Runde, Wesley G. Stevens, Paul A. Wortman, and John A. Chandy. 2012. An active storage framework for object storage devices. In MSST, 1–12.
[33] Josep Sampé. 2017. Zion framework. https://github.com/JosepSampe/zion (2017).
[34] Josep Sampé, Pedro García-López, and Marc Sánchez-Artigas. 2016. Vertigo: Programmable micro-controllers for software-defined object storage. In IEEE CLOUD, 180–187.
[35] Seung Woo Son, Samuel Lang, Philip Carns, Robert Ross, Rajeev Thakur, Berkin Ozisikyilmaz, Prabhat Kumar, Wei-Keng Liao, and Alok Choudhary. 2010. Enabling active storage on parallel I/O software stacks. In MSST, 1–12.
[36] Zhipeng Tan, Yanli Yuan, Dan Feng, Tian Zhan, and Xudong Tu. 2011. Implementation of method object in active object-based storage systems. In CCIS, 204–211.
[37] Ubuntu. 2017. Cloud image repository. https://cloud-images.ubuntu.com (2017).
[38] Ubuntu. 2017. Docker repository. https://hub.docker.com/r/_/ubuntu (2017).
[39] University of California, Irvine. 1996. Adult Data Set. http://archive.ics.uci.edu/ml/datasets/Adult (1996).
[40] Himanshu Vashishtha and Eleni Stroulia. 2011. Enhancing query support in HBase via an extended coprocessors framework. In ServiceWave, 75–87.
[41] Rajiv Wickremesinghe, Jeffrey S. Chase, and Jeffrey Scott Vitter. 2002. Distributed computing with load-managed active storage. In HPDC, 13–23.
[42] Yulai Xie, Dan Feng, Yan Li, and Darrell D. E. Long. 2016. Oasis: an active storage framework for object storage platform. Future Generation Computer Systems 56 (2016), 746–758.
