Synthesis of Fault-Tolerant Distributed Programs

Ali Ebnenasir
Department of Computer Science and Engineering
Michigan State University
East Lansing, MI 48824, USA
[email protected]

Advisor: Dr. Sandeep S. Kulkarni

Motivation

- Programs are subject to unanticipated faults
  - New classes of faults require adding the corresponding fault-tolerance
- How to add fault-tolerance?
  - Design a fault-tolerant program from scratch
  - Incremental addition of fault-tolerance
- How to ensure correctness?
  - Verification after the fact
  - Automatic synthesis of fault-tolerant programs (correct by construction)

Motivation (Continued)

- Synthesis of fault-tolerant programs
  - Start from a (temporal logic) specification
  - Start from the fault-intolerant program
- Synthesis of fault-tolerant programs from their fault-intolerant versions has the potential to
  - Reuse the behaviors of the fault-intolerant program
  - Preserve behaviors that are hard to specify (e.g., efficiency)
- Problem: Complexity of synthesis
  - A polynomial-time non-deterministic algorithm for the synthesis of fault-tolerant distributed programs [FTRTFT00]

Outline

Program and Fault Model

Distribution Model

Problem Statement

Strategy

Current Results

Future Plan

Program and Fault Model

- A program is identified by its state space and its set of transitions
  - Finite state space Sp
  - Invariant S, fault-span T ⊆ Sp
  - Program p, fault f, safety ⊆ { (s0, s1) | (s0, s1) ∈ Sp × Sp }
- Fault-tolerance
  - Satisfy a particular fault-tolerance specification in the presence of faults
  - Failsafe, nonmasking, masking

[Figure: state space Sp containing the regions S and T, with transitions labeled p, f, and p/f.]
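The model above can be made concrete with a small sketch. It assumes that the program p, the faults f, and the safety specification are each given as a set of transitions over Sp, and that the fault-span is taken to be the set of states reachable from the invariant under p and f. The helper names and the four-state toy instance are illustrative and do not come from the presentation.

```python
from collections import deque

def fault_span(invariant, program, faults):
    """Smallest superset of the invariant closed under program and fault transitions."""
    successors = {}
    for s0, s1 in program | faults:
        successors.setdefault(s0, set()).add(s1)
    span = set(invariant)
    queue = deque(invariant)
    while queue:
        state = queue.popleft()
        for nxt in successors.get(state, ()):
            if nxt not in span:
                span.add(nxt)
                queue.append(nxt)
    return span

def violates_safety(span, program, faults, bad_transitions):
    """True if a program or fault transition starting in the span is a bad transition."""
    return any(s0 in span and (s0, s1) in bad_transitions
               for s0, s1 in program | faults)

# Hypothetical toy instance: states 0..3, invariant {0, 1}.
S = {0, 1}
p = {(0, 1), (1, 0), (2, 3)}
f = {(1, 2)}            # a fault perturbs the program out of the invariant
safety_bad = {(2, 3)}   # safety given as a set of bad transitions

T = fault_span(S, p, f)
print(sorted(T))
print(violates_safety(T, p, f, safety_bad))
```

In this toy instance the fault takes the program to state 2, from which the program itself executes a bad transition, so the program is not failsafe as given.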

Distribution Model

- Read/Write restrictions
- Example
  - A program p with two processes j and k
  - Two Boolean variables a and b
  - Process j cannot read b
  - Can we include the following transition?
      (a=0, b=0) → (a=1, b=0)
    Only if we also include the transition
      (a=0, b=1) → (a=1, b=1)
- Groups of transitions (instead of individual transitions) must be chosen
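The grouping in this example can be sketched in a few lines of Python. The sketch assumes Boolean variables and assumes that a process cannot write a variable it cannot read, so an unreadable variable keeps its value across the transition. The function name `group_of` is hypothetical.

```python
from itertools import product

def group_of(transition, unreadable):
    """Return every transition the process cannot distinguish from `transition`.

    States are dicts from variable names to 0/1. Variables in `unreadable`
    are hidden from the process, so they may hold any value and must stay
    unchanged by the transition (assumption of this sketch).
    """
    s0, s1 = transition
    if any(s0[v] != s1[v] for v in unreadable):
        raise ValueError("a process cannot change a variable it cannot read")
    names = sorted(unreadable)
    group = []
    for values in product((0, 1), repeat=len(names)):
        hidden = dict(zip(names, values))
        group.append(({**s0, **hidden}, {**s1, **hidden}))
    return group

# Process j cannot read b, so its transition on a comes in a group of two.
t = ({"a": 0, "b": 0}, {"a": 1, "b": 0})
for src, dst in group_of(t, unreadable={"b"}):
    print(src, "->", dst)
```

Choosing or discarding the whole list returned by `group_of` is what makes synthesis under distribution restrictions harder than choosing individual transitions.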

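Problem Statement

- Input to the synthesis algorithm
  - Fault-intolerant program p
  - Specification Spec
  - Invariant S
  - Faults f
- Output
  - Fault-tolerant program p'
  - Invariant S'
- Constraints
  - Finite state space
  - Distribution restrictions

[Figure: state space Sp with invariants S and S', program p, and faults f. Labels: "No new transition here" and "New transitions added here".]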
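The structural part of the problem statement can be checked mechanically. The sketch below assumes the usual reading of the figure: the new invariant S' is a non-empty subset of S, and every transition of p' that starts and ends inside S' is already a transition of p, so new transitions appear only outside the invariant. It does not check fault-tolerance or distribution restrictions, and all names are illustrative.

```python
def restrict(transitions, states):
    """Transitions that start and end inside `states`."""
    return {(s0, s1) for s0, s1 in transitions if s0 in states and s1 in states}

def respects_problem_constraints(p, S, p_new, S_new):
    """Check that S' is a non-empty subset of S and p' adds nothing inside S'."""
    return (bool(S_new)
            and S_new <= S
            and restrict(p_new, S_new) <= restrict(p, S_new))

p = {(0, 1), (1, 0)}
S = {0, 1}
recovery = {(2, 0)}   # hypothetical recovery transition outside the invariant

print(respects_problem_constraints(p, S, p | recovery, S))   # True
print(respects_problem_constraints(p, S, p | {(0, 0)}, S))   # False
```

The second call fails because the self-loop on state 0 would introduce new behavior in the absence of faults.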

Strategy

- Theoretical issues
  - Develop heuristics
  - Explore polynomial-time boundaries
  - Analyze fault-intolerant programs
- Develop a synthesis framework for
  - Developers of fault-tolerance
  - Developers of heuristics

Theoretical Issues - Heuristics

- Apply heuristics to reduce the exponential complexity [SRDS01]
  - Assign weights to transitions and states based on their usefulness
  - Different approaches for resolving deadlocks and livelocks
- Identify the applicability of heuristics to the problem at hand
  - Choose different subsets of heuristics
  - Apply them in different orders

Theoretical Issues – Polynomial-Time Boundary

- Find properties of programs/specifications where polynomial-time synthesis is possible
- Example:
  - Algorithmic synthesis of failsafe fault-tolerant programs is NP-hard [ICDCS02]
  - Polynomial-time synthesis of failsafe fault-tolerance for monotonic programs and specifications

Example for Polynomial-Time Boundary: Monotonicity of Specifications

Definition: A specification spec is positive monotonic with respect to variable x iff, for every s0, s1, s'0, s'1 such that

- the values of all other variables in s0 and s'0 are the same, and
- the values of all other variables in s1 and s'1 are the same:

If the transition (s0, s1), with x = false in both s0 and s1, does not violate safety,
then the transition (s'0, s'1), with x = true in both s'0 and s'1, does not violate safety.
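For small state spaces, the definition of positive monotonicity can be checked by brute force. The sketch below assumes that every variable is Boolean and that the safety specification is available as a predicate `violates(s0, s1)` on transitions. Both the function name and the two toy specifications are hypothetical.

```python
from itertools import product

def is_positive_monotonic(variables, x, violates):
    """Brute-force check of positive monotonicity of a safety spec w.r.t. x.

    `violates(s0, s1)` is True when transition (s0, s1) violates safety.
    States are dicts from variable names to booleans.
    """
    others = [v for v in variables if v != x]
    assignments = [dict(zip(others, vals))
                   for vals in product((False, True), repeat=len(others))]
    for rest0, rest1 in product(assignments, repeat=2):
        s0, s1 = {**rest0, x: False}, {**rest1, x: False}   # x = false
        t0, t1 = {**rest0, x: True}, {**rest1, x: True}     # x = true
        if not violates(s0, s1) and violates(t0, t1):
            return False
    return True

variables = ("x", "y")

# Toy spec 1: y may not be raised while x is false.
spec1 = lambda s0, s1: (not s0["x"]) and (not s0["y"]) and s1["y"]
# Toy spec 2: y may not change while x is true.
spec2 = lambda s0, s1: s0["x"] and s0["y"] != s1["y"]

print(is_positive_monotonic(variables, "x", spec1))   # True
print(is_positive_monotonic(variables, "x", spec2))   # False
```

Swapping the roles of the `x = false` and `x = true` pairs in the loop would give the corresponding check for negative monotonicity.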

Example for Polynomial-Time Boundary: Monotonicity of Programs

Definition: Program p with invariant S is negative monotonic with respect to variable x iff, for every s0, s1, s'0, s'1 such that

- the values of all other variables in s0 and s'0 are the same, and
- the values of all other variables in s1 and s'1 are the same:

[Figure: inside invariant S, a transition (s0, s1) with x = true in both states, paired with a transition (s'0, s'1) with x = false in both states.]

Example for Polynomial-Time Boundary: Theorem

- Synthesis of failsafe fault-tolerance can be done in polynomial time if either:
  - the program is negative monotonic and the spec is positive monotonic; or
  - the program is positive monotonic and the spec is negative monotonic.
- If only one of the two monotonicity requirements (on the program or on the spec) is satisfied, then synthesizing failsafe fault-tolerance is still NP-hard.
- For many problems, these requirements are easily met, e.g., agreement, consensus, and commit.

Example for Polynomial-Time Boundary: Byzantine Agreement

- Processes: general g and three non-generals j, k, and l
- Variables
    d.g : {0, 1}
    d.j, d.k, d.l : {0, 1, ⊥}
    b.g, b.j, b.k, b.l : {true, false}
    f.j, f.k, f.l : {0, 1}
- Fault-intolerant program transitions
    d.j = ⊥ /\ f.j = 0   →   d.j := d.g
    d.j ≠ ⊥ /\ f.j = 0   →   f.j := 1
- Fault transitions
    ¬b.g /\ ¬b.j /\ ¬b.k /\ ¬b.l   →   b.j := true
    b.j   →   d.j := 0|1

[Figure: general g connected to non-generals j, k, and l.]
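The following sketch encodes these guarded commands in Python and shows one way the fault-intolerant program can go wrong. The slide lists the actions only for process j. Applying the same actions to k and l, and the same fault actions to the general g, is an assumption made here by symmetry, and the function names are hypothetical.

```python
BOT = None  # stands for the undecided value ⊥

def initial_state(d_g):
    """All non-generals undecided, nobody Byzantine, nobody finalized."""
    state = {"d.g": d_g, "b.g": False}
    for p in "jkl":
        state.update({f"d.{p}": BOT, f"b.{p}": False, f"f.{p}": 0})
    return state

def program_step(state, p):
    """Run one enabled program action of non-general p; return True if one ran."""
    if state[f"d.{p}"] is BOT and state[f"f.{p}"] == 0:
        state[f"d.{p}"] = state["d.g"]   # copy the general's decision
        return True
    if state[f"d.{p}"] is not BOT and state[f"f.{p}"] == 0:
        state[f"f.{p}"] = 1              # finalize
        return True
    return False

def become_byzantine(state, p):
    """Fault action: p turns Byzantine only if no process is Byzantine yet."""
    if not any(state[f"b.{q}"] for q in "gjkl"):
        state[f"b.{p}"] = True
        return True
    return False

def byzantine_write(state, p, value):
    """Fault action: a Byzantine process p sets its decision to 0 or 1."""
    if state[f"b.{p}"] and value in (0, 1):
        state[f"d.{p}"] = value
        return True
    return False

# A Byzantine general shows different values to j and k.
s = initial_state(d_g=0)
become_byzantine(s, "g")
program_step(s, "j")           # j copies 0
byzantine_write(s, "g", 1)     # the general changes its value
program_step(s, "k")           # k copies 1
program_step(s, "j")           # j finalizes with 0
program_step(s, "k")           # k finalizes with 1
print(s["d.j"], s["f.j"], s["d.k"], s["f.k"])
```

At the end of this run two non-Byzantine non-generals have finalized with different decisions, which is the kind of behavior a fault-tolerant version must rule out.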

Example for Polynomial-Time Boundary: Byzantine Agreement (Continued)

- Safety specification
  - Agreement: No two non-Byzantine non-generals can finalize with different decisions
  - Validity: If g is not Byzantine, each non-Byzantine non-general process should finalize with the same decision as g
- Read/Write restrictions
  - Readable variables for process j: b.j, d.j, f.j, d.g, d.k, d.l
  - Process j can write d.j, f.j
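The two safety requirements can be written as predicates on a single state. This is a simplified, state-based reading, offered as a sketch: states are assumed to be dicts keyed by the variable names from the slides, with `None` standing for ⊥, and the function names are hypothetical.

```python
NON_GENERALS = "jkl"

def violates_agreement(state):
    """Two non-Byzantine non-generals have finalized with different decisions."""
    finalized = {state[f"d.{p}"] for p in NON_GENERALS
                 if not state[f"b.{p}"] and state[f"f.{p}"] == 1}
    return len(finalized) > 1

def violates_validity(state):
    """g is not Byzantine, yet a non-Byzantine non-general finalized differently."""
    if state["b.g"]:
        return False
    return any(not state[f"b.{p}"]
               and state[f"f.{p}"] == 1
               and state[f"d.{p}"] != state["d.g"]
               for p in NON_GENERALS)

# Hypothetical state: Byzantine general, j and k finalized with 0 and 1.
bad = {"d.g": 1, "b.g": True,
       "d.j": 0, "b.j": False, "f.j": 1,
       "d.k": 1, "b.k": False, "f.k": 1,
       "d.l": None, "b.l": False, "f.l": 0}
print(violates_agreement(bad), violates_validity(bad))
```

Validity is vacuous in this state because the general is Byzantine, while agreement is violated by j and k.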

Example for Polynomial-Time Boundary: Byzantine Agreement (Continued)

- Observation 1: Positive monotonicity of the specification with respect to b.j
- Observation 2: Negative monotonicity of the program, consisting of the transitions of j, with respect to b.k
- Observation 3: Negative monotonicity of the specification with respect to f.j
- Observation 4: Positive monotonicity of the program, consisting of the transitions of j, with respect to f.k

Example for Polynomial-Time Boundary: Byzantine Agreement (Continued)

Failsafe fault-tolerant program:

    d.j = ⊥ /\ f.j = 0   →   d.j := d.g
    d.j ≠ ⊥ /\ ((d.j = d.k) \/ (d.j = d.l)) /\ f.j = 0   →   f.j := 1
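The only change in the failsafe program is the stronger guard on the finalize action. The sketch below compares the guard of the fault-intolerant finalize action with the failsafe one on a state where a Byzantine general has shown different values to different processes. States are assumed to be dicts keyed by the slide's variable names, with `None` for ⊥. Using the same guard for k and l is an assumption made by symmetry, and the function names are hypothetical.

```python
BOT = None  # stands for the undecided value ⊥

def can_finalize_intolerant(s, p):
    """Guard of the fault-intolerant finalize action of non-general p."""
    return s[f"d.{p}"] is not BOT and s[f"f.{p}"] == 0

def can_finalize_failsafe(s, p):
    """Failsafe guard: p also needs another non-general with the same decision."""
    others = [q for q in "jkl" if q != p]
    agrees = any(s[f"d.{p}"] == s[f"d.{q}"] for q in others)
    return s[f"d.{p}"] is not BOT and agrees and s[f"f.{p}"] == 0

# Hypothetical state: j copied 0, then k and l copied 1 from a Byzantine general.
s = {"d.g": 1, "b.g": True,
     "d.j": 0, "f.j": 0,
     "d.k": 1, "f.k": 0,
     "d.l": 1, "f.l": 0}

print(can_finalize_intolerant(s, "j"))   # True
print(can_finalize_failsafe(s, "j"))     # False
print(can_finalize_failsafe(s, "k"))     # True
```

Here j is blocked because neither k nor l shares its decision, while k and l can still finalize together. Blocking j is acceptable for failsafe fault-tolerance, which requires safety in the presence of faults.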

Theoretical Issues – Analysis of Fault-Intolerant Programs

- Analyze the behavior and the structure of the fault-intolerant program.
- Example:
  - Reasoning about the program in high atomicity, i.e., no distribution restrictions.
  - Enhancement of fault-tolerance [ICDCS03].
- Take advantage of model checkers.

Theoretical Issues – Analysis of Fault-Intolerant Programs

[Figure: the synthesis framework interacting with the SPIN model checker. Labels: fault-intolerant program, intermediate program in Promela, counterexample, fault-tolerant program.]

Theoretical Issues: Current Results

[Figure: diagram starting from the intolerant program. Labels: masking fault-tolerant [FTRTFT00], failsafe fault-tolerant [ICDCS02], nonmasking fault-tolerant [ICDCS03].]

Synthesis Framework

- Goals:
  - Algorithmic synthesis of fault-tolerant programs from their fault-intolerant versions.
  - Easy to integrate new heuristics.
  - Easy to change its implementation.
- Users:
  - Developers of fault-tolerance.
  - Developers of heuristics.
- Examples:
  - A canonical version of Byzantine agreement.
  - An agreement program that is subject to Byzantine and failstop faults (1.3 million states).
  - A token ring program perturbed by state-corruption faults.

Related Work

- E.A. Emerson and E.M. Clarke, Using branching time temporal logic to synthesize synchronization skeletons, 1982.
- Z. Manna and P. Wolper, Synthesis of communicating processes from temporal logic specifications, 1984.
- A. Arora, P.C. Attie, and E.A. Emerson, Synthesis of fault-tolerant concurrent programs, 1998.
- P.C. Attie and E.A. Emerson, Synthesis of concurrent programs for an atomic read/write model of computation, 1996.
- O. Kupferman and M. Vardi, Synthesis with incomplete information, 1997.

Future Plan

- Theoretical issues
  - Develop more intelligent heuristics to reduce the chance of failure in the synthesis
  - Find the polynomial-time boundary for other levels of fault-tolerance
- Synthesis framework issues
  - Scalability of the synthesis framework for larger programs
  - Implement the synthesis algorithm on a distributed platform

Future Plan (Continued)

- Synthesis framework issues
  - Use model checkers for behavioral analysis
    - Query
      - Intermediate program
      - Reachability analysis from a given state
    - Result set
      - Deadlock states
      - Non-progress cycles
      - Finite sequence of states

Publications

[ICDCS02] Sandeep S. Kulkarni and Ali Ebnenasir. The Complexity of Adding Failsafe Fault-Tolerance. The 22nd International Conference on Distributed Computing Systems, July 2-5, 2002, Vienna, Austria.

[ICDCS03] Sandeep S. Kulkarni and Ali Ebnenasir. Enhancing The Fault-Tolerance of Nonmasking Programs. Accepted in the 23rd International Conference on Distributed Computing Systems, May 19-22, 2003, Providence, Rhode Island, USA.

[SRDS03] Sandeep S. Kulkarni and Ali Ebnenasir. A Framework for Automatic Synthesis of Fault-Tolerance. Submitted to the 22nd Symposium on Reliable Distributed Systems, October 6-8, 2003, Florence, Italy.

The implementation of the synthesis framework: http://www.cse.msu.edu/~sandeep/software/Code/synthesis-framework/

Thank You!

Questions and Comments?

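Reduction from 3-SAT

[Figure: reduction gadget with states a0, ..., an (an = a0), x0, x1, ..., xn, and x'0, x'1, ..., x'n. Transitions are labeled "included iff x0 is false", "included iff x0 is true", "included iff xj is false", "included iff xk is true", and "included iff xl is false". Example clause: cj = xj \/ ¬xk \/ xl.]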