web based fault tolerance prepared by: shah muhammad hamdi krishanu sarker sairaj bharath...

Web Based Fault Tolerance

Prepared by: Shah Muhammad Hamdi, Krishanu Sarker, SaiRaj Bharath Yalamanchili, Xueli Xiao, Heta P. Desai
Advanced Operating System, Dr. Yanqing Zhang
November 11, 2015

Outline
- Fault Tolerance in Web
- Overview of Byzantine Fault
- Discussion on the paper: Application Aware Byzantine Fault Tolerance
- References

Fault Tolerance in Web
- Fault tolerance can be viewed from two sides:
  - Machines, i.e. web servers
  - Processes, i.e. web services

From the View of Machines
- FT can be achieved through:
  - Client-side caching
  - Server replication
  - Redundancy
  - High availability
- Example: DNS returning several addresses as the result of a name lookup.

DNS Supporting Fault Tolerance
[Figure: the user asks the DNS server for Server X; the DNS server returns the addresses of replicas 1, 2 and 3 (Server X: Rep 1, Rep 2, Rep 3).]

FT in Two-Tiered and Multi-Tiered Architecture
- Two-tiered architecture: the client sends a request to the server and the server responds.
  - Easy to make fault tolerant
  - Only machine replication is needed
- Multi-tiered architecture
  - Callers and callees need replication of the invocations.
  - Comparatively difficult

From the View of Processes
- Two techniques:
  - Process resilience: one or more processes can fail without disturbing the system.
  - Reliable multicasting: all messages successfully reach all the processes.

K Fault Tolerance: General Failures
- General failures: crash, omission, response and timing failures.
- A system is k fault tolerant if k+1 copies are kept.
- If k copies crash, 1 still exists.

K Fault Tolerance: Byzantine Failures
- A system is k fault tolerant if 2k+1 copies are kept.
- In the worst case, all k faulty servers send the same wrong answer.
- Then there are still k + 1 servers with the right answer.
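
The replica counts above can be checked with a small sketch. It assumes a hypothetical helper `replicas_needed` and a simple strict-majority `vote`; neither name comes from the slides, and the vote only models the case described here, where the replies of all 2k+1 replicas are compared.

```python
from collections import Counter


def replicas_needed(k: int, byzantine: bool) -> int:
    """Copies required to tolerate k faults (hypothetical helper)."""
    # Crash-like failures need k + 1 copies; Byzantine failures need 2k + 1.
    return 2 * k + 1 if byzantine else k + 1


def vote(replies):
    """Return the reply given by a strict majority of the replicas."""
    value, count = Counter(replies).most_common(1)[0]
    if count * 2 <= len(replies):
        raise ValueError("no majority among replies")
    return value


k = 2
n = replicas_needed(k, byzantine=True)
# Worst case from the slide: all k faulty replicas send the same wrong answer.
replies = ["wrong"] * k + ["right"] * (n - k)
result = vote(replies)  # the k + 1 correct replies outvote the k faulty ones
```

With k = 2 this gives 5 replicas, and the 3 correct replies form the majority.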

Difficulty in Handling Byzantine Fault Tolerant Servers in Three-Tiered Architecture
- Suppose:
  - The application server is a Byzantine fault tolerant (replicated) server.
  - The database server is a nonreplicated server.

Difficulty in Handling Byzantine Fault Tolerant Servers in Three-Tiered Architecture (Cont.)
- Let k = 2.
- A 2-Byzantine fault tolerant system needs (2*2+1) = 5 replicas.

Difficulty in Handling Byzantine Fault Tolerant Servers in Three-Tiered Architecture (Cont.)
- Three requirements:
  1. The client should see a single application server instead of 5.
  2. The BFT server needs an additional protocol for processing inconsistent answers from the DB server.
  3. The DB server should also see a single application server instead of 5.

Overview on Byzantine Fault Tolerance (Krishanu Sarker)

What is Byzantine Fault?
- An incorrect operation (algorithm) that occurs in a distributed system and that can be classified as:
  - Omission failure: a failure of not being present, such as failing to respond to a request or not receiving a request.
  - Execution failure or lying: a failure due to sending incorrect or inconsistent data, corrupting the local state, or responding to a request incorrectly.

Examples
- Round-off errors passed from one function to another and then another, etc.
- Corrupted system databases where the error is not detected
- Compiler errors
- An undetected bit flip producing a bad message

Byzantine Generals Problem
- An illustrative example of Byzantine faults
- Different armies want to besiege an enemy city.
- Success requires agreement on a common plan amongst the dispersed army.
- Complication: there may be traitors who send out conflicting messages to sow confusion.
- Challenge: find an algorithm that ensures the armies come to an agreement and attack at the same time.
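
The slides do not name an algorithm for this challenge. One classic answer for four generals with at most one traitor is the one-round oral-messages algorithm OM(1) of Lamport, Shostak, and Pease, sketched below as an illustration only. The function names `majority` and `om1`, the fixed `lie` value a traitorous lieutenant relays, and the "retreat" tie-break are all assumptions made for this example.

```python
from collections import Counter


def majority(values, default="retreat"):
    """Most common value; fall back to `default` on a tie (assumed rule)."""
    counts = Counter(values).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return default
    return counts[0][0]


def om1(orders, traitors=frozenset(), lie="retreat"):
    """One round of relaying among lieutenants.

    orders[i] is the order the commander sent to lieutenant i; a traitorous
    commander is modelled by passing conflicting orders. Lieutenants listed
    in `traitors` relay `lie` instead of what they received.
    Returns the decision of every loyal lieutenant.
    """
    decisions = {}
    for i in range(len(orders)):
        if i in traitors:
            continue
        seen = [orders[i]]  # what the commander told lieutenant i
        for j in range(len(orders)):
            if j != i:
                # What lieutenant j claims the commander said.
                seen.append(lie if j in traitors else orders[j])
        decisions[i] = majority(seen)
    return decisions


# Traitorous commander sends conflicting orders; loyal lieutenants still agree.
traitor_commander = om1(["attack", "retreat", "attack"])
# Loyal commander, traitorous lieutenant 2; the others follow the real order.
traitor_lieutenant = om1(["attack", "attack", "attack"], traitors={2})
```

In both runs every loyal lieutenant decides "attack", so the loyal generals act on a common plan despite the traitor.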

Real World Relationships
- Generals -> processors
- Traitors -> faulty processors or faulty system components (including software)
- Messengers -> processor communications/system data bus

Byzantine Generals Problem Scenario

Byzantine Generals Problem Scenario (Cont.)

Problem
- How do replicated units reach agreement on a non-replicated value in the presence of malicious faults?
- Simple voting algorithms cannot handle malicious faults.

Introduction to Application-Aware Byzantine Fault Tolerance (SaiRaj Bharath Yalamanchili)

Application-Aware Byzantine Fault Tolerance
- Every mission-critical application needs to be fault tolerant in order to be resilient.
- Highly efficient Byzantine fault tolerance algorithms have been developed in the past decade.

Why Application-Aware Byzantine Fault Tolerance?
- Standard state-machine-based Byzantine fault tolerance algorithms require deterministic application processing and sequential execution of totally ordered requests.
- These constraints would impede the adoption of Byzantine fault tolerance techniques in practice. For example:
  - Practical systems often involve nondeterministic operations when they execute clients' requests, and the states of the replicas would diverge if the application nondeterminism is not controlled.
  - Sequential execution of all requests to the replicated server often results in unacceptably low system throughput and long end-to-end latency.
- To overcome these issues, the application semantics must be tapped.
- Initially, in PBFT, a networked file system (NFS) was replicated for Byzantine fault tolerance.
- PBFT exploited the application semantics of NFS to identify nondeterministic operations and designed a control mechanism accordingly.
- Since PBFT, many other mechanisms and algorithms have been developed to facilitate the adoption of Byzantine fault tolerance techniques for practical distributed systems by exploiting application semantics at different levels.

Why Application-Aware Byzantine Fault Tolerance? (Cont.)
- Strong argument that it is not practical to treat the server application as a black box when replicating it for Byzantine fault tolerance, and that it is essential to exploit the application semantics.
- Identify scenarios in which the replicated system may deadlock if all requests must be executed sequentially according to a total order.
- Provide a classification of various application-aware Byzantine fault tolerance approaches based on the levels of application semantics that are exploited.
- Outline the mechanisms designed for achieving application-aware Byzantine fault tolerance.

What Is Covered in This Paper?
- Motivations for taking an application-aware approach to Byzantine fault tolerance:
  - Reduce runtime overhead
  - Respect causality and avoid deadlock
  - Ensure strong replica consistency

Reduce Runtime Overhead
- Two main types of runtime overhead are introduced by Byzantine fault tolerance algorithms:
  - Communication and processing delays for each remote invocation due to the need for total ordering of requests, which impacts the end-to-end latency as seen by a client.
  - The loss of concurrency at the replicated server due to the sequential execution of requests, which impacts the system throughput.

Reduce Runtime Overhead (Cont.)
- By exploiting application semantics, we can introduce the following optimizations:
  - Reducing the end-to-end latency by minimizing the total ordering of requests. There is no need to totally order read-only requests (i.e., requests that do not modify the server state).
  - Enabling concurrent processing at the server replicas for some requests. Independent requests can be delivered and processed concurrently.
  - Similarly, commutative requests can be delivered and processed concurrently.

Respect Causality and Avoid Deadlock
- General-purpose Byzantine fault tolerance algorithms are designed for simple client-server applications in which the clients do not directly interact with each other and send requests to the replicated server independently.
- When this assumption does not hold, two problems can arise:
  - Causality violation
  - Deadlocks
- In these cases, application semantics must be tapped to discover the causal ordering between requests and to identify which requests must be delivered concurrently. Otherwise, the integrity and the availability of the system will be lost.

Ensure Strong Replica Consistency
- State-machine replication requires that the replicas behave deterministically when processing requests.
- However, many applications involve some form of nondeterministic operation, such as taking a timestamp or accessing a pseudo-random number generator.
- Without knowledge of the application semantics, it is usually impossible to know whether a request would trigger a nondeterministic operation.
- If such nondeterministic operations are not controlled, the states of the replicas might diverge and the replicas might produce different replies to the client.

Classification of Application-Aware Byzantine Fault Tolerance (Xueli Xiao)

Classification

First Criterion - Types of Operations

First Criterion - Types of Operations (Cont.)
- Basic operations:
  - Read or write operation
  - Creation or deletion of an object

First Criterion - Basic Operations
- The relationship between different requests is determined by the target objects and the corresponding operations.
- Two requests are considered dependent on each other if they access at least one common state object and the operation on that object from any of the requests is a write operation.
- Otherwise, the two requests are considered independent of each other.
- Create and delete operations: create operations are independent of each other, and similarly delete operations are independent of each other. Create and delete operations are totally ordered with respect to each other.
- Requests that are dependent on each other must be delivered and executed sequentially according to some total order at all replicas to ensure strong replica consistency.
- Independent requests can be delivered and executed concurrently with respect to each other.

First Criterion - Types of Operations (Cont.)
- Complex operations: operations beyond basic operations
  - Increment and decrement for integer numbers
  - Append and truncate for text strings
  - Setting a new value to a state object nondeterministically (using the local clock value or a pseudo-random number generator)

First Criterion - Complex Operations
- The dependency among the requests can be analyzed in a similar manner as for requests with basic operations, by classifying complex operations into read-oriented and write-oriented types.
- Complex operations might not conflict with each other: even if two requests access the same object with at least one of them engaging in a write-oriented operation, they might not conflict, depending on the complex operations involved.
- For example, if both requests perform an increment operation on the same state object, they are considered commutative and hence can be delivered and processed concurrently.
- To ensure strong replica consistency, the complexity of the operation could also require additional mechanisms.
- For example, if either the value to be returned to the client or the new value to be assigned to the state object is determined by reading the local clock or by the output of a pseudo-random number generator, additional mechanisms are needed to ensure that all replicas decide on the same value.

Second Criterion - Context of the Request

Second Criterion - Context of the Request (Cont.)
- Context is considered: the requests issued by the same or different clients in the same session are correlated.
- Context is not considered: requests from different clients are assumed to be sent independently.

Second Criterion - Sessionless Interactions
- The requests are assumed to be sent independently.
- The replicas may determine a total order of the requests sent by different clients arbitrarily.

Second Criterion - Session-Oriented Interactions
- Requests sent within a session may be correlated.
- Constraints on ordering and handling the requests:
  - Total ordering of requests must respect causality: the most obvious constraint due to request correlation is that the replicas can no longer impose a total order according to the order in which the requests are received, because doing so might violate the causal ordering among the requests.
  - Forced concurrent processing: in multi-tiered applications, nested remote invocations are common. If two or more replicated servers issue nested invocations to each other in response to requests sent from their clients, the nested requests must be delivered and executed before the initial requests are fully processed. Dictating a total order for the requests from the clients and those for nested invocations could lead to deadlocks.
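
The deadlock described above can be made concrete with a wait-for graph. This is a minimal sketch under stated assumptions: two replicated servers A and B each execute their delivered requests strictly in queue order, client request `r1` at A issues nested request `n1` to B, and client request `r2` at B issues nested request `n2` to A. The helpers `build_wait_for` and `has_deadlock` and all request names are hypothetical.

```python
def build_wait_for(queues, nested):
    """Build a wait-for graph: request -> requests it must wait for.

    queues: server name -> request ids in the imposed total (delivery) order.
    nested: request id -> nested request it issues and blocks on.
    """
    waits = {req: [child] for req, child in nested.items()}
    for order in queues.values():
        # Sequential execution: each request waits for the one ahead of it.
        for earlier, later in zip(order, order[1:]):
            waits.setdefault(later, []).append(earlier)
    return waits


def has_deadlock(waits):
    """Depth-first search for a cycle in the wait-for graph."""
    state = {}

    def visit(node):
        if state.get(node) == "active":
            return True  # reached a request that is already on the path
        if state.get(node) == "done":
            return False
        state[node] = "active"
        if any(visit(nxt) for nxt in waits.get(node, [])):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in list(waits))


nested = {"r1": "n1", "r2": "n2"}

# Nested requests queued behind the client requests in a single total order.
sequential = build_wait_for({"A": ["r1", "n2"], "B": ["r2", "n1"]}, nested)

# Nested requests delivered concurrently, outside the sequential queue.
concurrent = build_wait_for({"A": ["r1"], "B": ["r2"]}, nested)

deadlock_when_sequential = has_deadlock(sequential)
deadlock_when_concurrent = has_deadlock(concurrent)
```

In the sequential case the graph contains the cycle r1 -> n1 -> r2 -> n2 -> r1, so neither server can make progress; delivering the nested requests concurrently removes the cycle.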

Application Aware Byzantine Fault Tolerance Mechanisms (Heta Desai)

System Model
- Based on the system model introduced in Practical BFT
- Operates in an asynchronous distributed system
- Message exchange is protected by a digital signature or a message authentication code.
- For a client-server application, the server is replicated with a sufficiently high replication degree.
- For a multi-tier application, the middle-tier or backend server is replicated.

Requests with Basic Operations
- Read-only requests
  - If a request does not change any of the state objects, it can be delivered at a replica immediately.
  - To ensure that the client reads from a nonfaulty replica, the client collects f + 1 replies and matches them before accepting a reply.
- Request partitioning
  - Requests are partitioned if they operate on disjoint state objects.
  - Requests are separated based on an identifier.
  - Requests that belong to different partitions are executed concurrently and handled by different groups of server replicas for scalability.

Requests with Basic Operations (Cont.)
- Dependency analysis
  - The core of this mechanism is a predefined concurrency matrix.
  - The matrix is used to determine whether two requests are independent, based on their operations and arguments.
  - Message delivery control is handled by a parallelizer component between the Byzantine agreement layer and the server application.

Parallelizer and Its Interactions

The Parallelizer
- Tracks dependencies according to the concurrency matrix
- The concurrency matrix is based on application-specific rules.
- Assumption: each replica runs a pool of worker threads that actively fetch ordered requests.
- Insert(): the request is placed on the to-be-executed queue according to the total order.
- Next_request(): invoked when a new worker thread is launched; the thread blocks waiting.
- Remove_request(): invoked when a worker thread finishes executing; removes the request from the queue.

Requests with Complex Operations
- Commutative requests
  - Requires the total ordering of all requests.
  - If commutative requests are not ordered: requires 5f + 1 replicas to tolerate f faulty replicas.
  - Speculative execution: if the speculation is wrong, recovery is expensive.
  - Solution: implement commutative replicated data types, and synchronize the state of the replicas only when necessary or periodically.

Requests with Complex Operations (Cont.)
- Replica nondeterminism
  - Simple location-specific nondeterminism can be masked by a wrapper function.
  - This function translates location-specific values (e.g., a process ID) to a group-wide ID to achieve strong replica consistency.
  - The classification is based on:
    - Can the nondeterministic operations and their associated values be verified before execution of a request?
    - Can the nondeterministic value decided by one replica be verified by another replica?

Session-Oriented Interactions
- Preventing causality violation
  - Arises when clients communicate directly with each other without the knowledge of the replicated server.
  - If we allow concurrent delivery of requests.
- Avoiding deadlocks
  - Resolved with concurrent delivery of requests that have a parent-child relationship.
- Source ordering
  - Only 2f + 1 middle-tier server replicas are needed to tolerate up to f faulty replicas.
  - A client and a backend server must collect f + 1 matching replies.

Session-Oriented Interactions (Cont.)
- Deferred agreement
  - Session: defines the beginning and end of an interaction by a group of participants.
  - Different from batching agreement
  - Applies to requests received over time during a session
- Request partitioning
  - Usually, requests that belong to different sessions are handled by different state objects at the middle-tier server.
  - Requests can often be simply partitioned based on sessions.

Conclusion
- Presented an overview of application-aware Byzantine fault tolerance techniques.
- Application semantics are used to build practical solutions for BFT.
- Application semantics can be used not only to improve runtime performance; they are also necessary to preserve the causality relationship of the requests, avoid potential deadlocks, and control replica nondeterminism.

Reference
- Andrew S. Tanenbaum and Maarten Van Steen, Distributed Systems: Principles and Paradigms (Second Edition).
- Wenbing Zhao, "Application-Aware Byzantine Fault Tolerance," in Dependable, Autonomic and Secure Computing (DASC), 2014 IEEE 12th International Conference on, pp. 45-50, Aug, doi: /DASC
- https://en.wikipedia.org/wiki/Byzantine_fault_tolerance, accessed on 10th November, 2015.
