Building large-scale job processing systems with the Scala Akka actor framework
DESCRIPTION
The Akka Actor framework is designed to be a fast message processing system. In this talk, we will explain how, at Box, we have used this framework to develop a large-scale job processing system that works on billions of data files and achieves a high degree of throughput and fault tolerance. Over the course of the talk, we will explore the use of the Akka framework’s Supervisor functionality to provide a more controllable fault-tolerance strategy, and how we can use Futures to manage asynchronous jobs.
TRANSCRIPT
Building massive scale, fault tolerant, job processing systems with Scala Akka framework
Vignesh Sukumar
SVCC 2012
About me
• Storage group, Backend Engineering at Box
• Love enterprise software!
• Interested in Big Data and building distributed systems in the cloud
About Box
• Leader in enterprise cloud collaboration and storage
• Cutting-edge work in backend, frontend, platform and engineering services
• A really fun place to work – we have a long slide!
Talk outline
• Job processing requirements
• Traditional & new models for job processing
• Akka actors framework
• Achieving and controlling high IO throughput
• Fine-grained fault tolerance
Typical architecture in a cloud storage environment
Practical realities
• Storage nodes usually have varying configurations (OS, processing power, storage capacity, etc.), mainly because of rapid evolution in provisioning operations
• Some nodes are more overworked than others (for example, those accepting live uploads)
• Billions of files; petabytes
Job processing requirements
• Iterate over all files (billions, petabyte scale): for example, check consistency of all files
• High throughput
• Fault tolerant
• Secure
Traditional job processing model
Why traditional models fail in cloud storage environments
• Not scalable: petabyte scale, billions of files
• Insecure: cannot move files out of storage nodes
• No performance control: easy to overwhelm any storage node
• No fine-grained fault tolerance
Compute on Storage
• Move job computation directly to storage nodes
• Utilize abundant CPU on storage nodes
• Metadata store still stays in a highly available system like an RDBMS
• Results from operations on a file are completely independent
Master – slave architecture
Benefits
• High IO throughput: Direct access; no transfer of files over a network
• Secure: files do not leave storage nodes
• Better performance control: compute can easily monitor system load and back off
• Better fault tolerance handling: finer-grained handling of errors
Master node
• Responsible for accepting job submissions and splitting them into tasks for slave nodes
• Stateful: keeps durable copy of jobs and tasks in Zookeeper
• Horizontally scalable: service can be run on multiple nodes
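The splitting step on the master can be pictured with a small sketch. It assumes a job is a list of file IDs and that the storage node holding each file is known from the metadata store. `FileRef`, `Task`, `JobSplitter` and the batch size are hypothetical names for illustration; the talk does not describe the actual job or task schema.

```scala
object JobSplitter {
  // Hypothetical types: not taken from the talk.
  final case class FileRef(fileId: Long, node: String)
  final case class Task(node: String, fileIds: Vector[Long])

  /** Groups files by the storage node that holds them, then cuts each group
    * into batches of at most `batchSize` files so that a single task stays small. */
  def split(files: Seq[FileRef], batchSize: Int): Vector[Task] = {
    require(batchSize > 0, "batchSize must be positive")
    files
      .groupBy(_.node)
      .toVector
      .sortBy(_._1) // deterministic task order by node name
      .flatMap { case (node, refs) =>
        refs.map(_.fileId).grouped(batchSize).map(ids => Task(node, ids.toVector))
      }
  }

  def main(args: Array[String]): Unit = {
    val files = Seq(
      FileRef(1L, "node-a"), FileRef(2L, "node-b"),
      FileRef(3L, "node-a"), FileRef(4L, "node-a"))
    split(files, batchSize = 2).foreach(println)
  }
}
```

Because each task names exactly one node, the agent on that node can work from local disk only, which matches the "files do not leave storage nodes" requirement.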
Agent
• Runs directly on the storage nodes on a machine-independent JVM container
• Stateless: no task state is maintained
• Monitors system load with back-off
• Reports results directly to master without synchronizing with other agents
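A minimal sketch of load-based back-off, assuming the agent samples the load average per core before picking up its next task. The threshold, the linear ramp and the names `LoadBackoff`, `delayMs` and `currentLoadPerCore` are illustrative assumptions; the talk does not say how Box measures load or computes the pause.

```scala
import java.lang.management.ManagementFactory

object LoadBackoff {
  /** Returns how long (in ms) an agent should pause before its next task.
    * At or below `threshold` there is no pause; above it the pause grows
    * linearly with the excess load and is capped at `maxDelayMs`.
    * A negative load (meaning "unavailable" on some platforms) yields no pause. */
  def delayMs(loadPerCore: Double, threshold: Double = 0.7, maxDelayMs: Long = 30000L): Long =
    if (loadPerCore <= threshold) 0L
    else math.min(maxDelayMs, math.round((loadPerCore - threshold) * 10000))

  /** System load average divided by the number of available processors. */
  def currentLoadPerCore(): Double = {
    val os = ManagementFactory.getOperatingSystemMXBean
    os.getSystemLoadAverage / os.getAvailableProcessors
  }

  def main(args: Array[String]): Unit = {
    val pause = delayMs(currentLoadPerCore())
    println(s"pausing for $pause ms before the next task")
    Thread.sleep(pause)
  }
}
```

Keeping `delayMs` a pure function of the measured load makes the policy easy to unit test without touching the operating system.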
Implementation with the Scala Akka Actor framework
Actors
• Concurrent threads abstraction with no shared state
• Exchange messages
• Asynchronous, non-blocking
• Multiple actors can map to a single OS thread
• Parent-children hierarchical relationship
Actors and messages
    import akka.actor.{Actor, ActorSystem, Props}

    case object MsgType1

    class MyActor extends Actor {
      def receive = {
        case MsgType1 => // do something
      }
    }

    // instantiation and sending messages
    val system = ActorSystem("example")
    val actorRef = system.actorOf(Props(new MyActor))
    actorRef ! MsgType1
Agent Actor System
Achieving high IO throughput
• Parallel, asynchronous IO through “Futures”
    val fileIOResult = Future {
      // issue high latency tasks like file IO
    }
    val networkIOResult = Future {
      // read from network
    }
    Futures.awaitAll(<wait time>, fileIOResult, networkIOResult)
    fileIOResult onSuccess { case result => /* do something */ }
    networkIOResult onFailure { case error => /* retry */ }
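The same pattern can be sketched with the standard library's `scala.concurrent` API, which may differ from the Akka version used in the talk. The checksum functions below are hypothetical stand-ins for real file and network IO; only the shape (start both futures first, then combine, then wait with a bound) is the point.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object ParallelIo {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical stand-ins for high-latency work (not from the talk).
  def readFileChecksum(path: String): Future[Int] = Future { path.length }
  def fetchExpectedChecksum(path: String): Future[Int] = Future { path.length }

  /** Starts both operations before waiting on either, so they run concurrently. */
  def isConsistent(path: String): Future[Boolean] = {
    val fileIOResult = readFileChecksum(path)
    val networkIOResult = fetchExpectedChecksum(path)
    for {
      actual   <- fileIOResult
      expected <- networkIOResult
    } yield actual == expected
  }

  def main(args: Array[String]): Unit = {
    val result = isConsistent("/data/file-1")
    result.onComplete {
      case Success(ok) => println(s"consistent: $ok")
      case Failure(e)  => println(s"failed, retry later: ${e.getMessage}")
    }
    Await.ready(result, 5.seconds) // bounded wait, similar in spirit to the wait time on the slide
  }
}
```

Creating both futures as `val`s before the `for` comprehension matters: if they were created inside it, the second operation would only start after the first had finished.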
Controlling system throughput
• The problem: agents need to throttle themselves as storage nodes serve live traffic
• Adjust number of parallel workers dynamically through a monitoring service
Controlling throughput: Examples
• Parallelism parameters can be fetched from a separate configuration service on a per-node basis
• Some machines can be sped up and others slowed down this way
• The configuration can be updated on a cron schedule to speed up during weekends
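One possible shape for a dynamically adjustable worker limit is sketched below. `Throttle` is a hypothetical helper, not a class from the talk or from Akka; it assumes some external poller calls `setLimit` whenever the configuration service reports a new value for this node.

```scala
import java.util.concurrent.atomic.AtomicInteger

/** Hypothetical throttle: at most `limit` tasks run at once,
  * and the limit can be changed while tasks are in flight. */
final class Throttle(initialLimit: Int) {
  private val limit = new AtomicInteger(math.max(0, initialLimit))
  private val inFlight = new AtomicInteger(0)

  /** Called when the configuration service reports a new parallelism value. */
  def setLimit(newLimit: Int): Unit = limit.set(math.max(0, newLimit))

  /** Tries to claim a worker slot; returns false if the node is at its current limit. */
  def tryAcquire(): Boolean = {
    val claimed = inFlight.incrementAndGet()
    if (claimed <= limit.get) true
    else { inFlight.decrementAndGet(); false }
  }

  /** Must be called exactly once for every successful tryAcquire. */
  def release(): Unit = inFlight.decrementAndGet()
}
```

Lowering the limit does not interrupt running tasks; it only stops new ones from starting until enough of them finish, which is a gentle way to slow a node that is serving live traffic.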
Fine grained fault tolerance with Supervisors
• Parents of child actors can define specific fault-handling strategies for each failure scenario in their children
• Components can fail gracefully without affecting the entire system
Supervision strategy: Examples
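    import akka.actor.{Actor, OneForOneStrategy}
    import akka.actor.SupervisorStrategy.{Restart, Resume, Stop}

    class TaskActor extends Actor {
      // create child workers
      override val supervisorStrategy = OneForOneStrategy(maxNrOfRetries = 3) {
        case _: SqlException            => Resume  // retry the same file
        case _: FileCorruptionException => Stop    // don’t clobber it!
        case _: IOException             => Restart // report and move on
      }
      // message handling (receive) is not shown on the slide
    }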
Unit testing
• Scalatra test framework: very easy to read!
    TaskActorTest.receive(BadFileMsg) must throw FileNotFoundException
• Mocks for network and database calls
    val mockHttp = mock[HttpExecutor]
    TaskActorTest ! doHttpPost
    there was atLeastOne(mockHttp).POST
• Extensive testing of failure injection scenarios
Takeaways
• Keep your architecture simple by modeling actor message flow along the same paths as the parent-child actor hierarchy (i.e., no message exchange between peer child actors)
• Design and implement for component failures
• Write unit tests extensively: we did not have any fundamental functionality breakage
• Box Engineering is awesome!