Lecture 2: Combinatorial Modeling CS 7040 Trustworthy System Design, Implementation, and Analysis Spring 2015, Dr. Rozier Adapted from slides by WHS at UIUC

Upload: bennett-blair

Post on 02-Jan-2016


Lecture 2: Combinatorial Modeling

CS 7040

Trustworthy System Design, Implementation, and Analysis

Spring 2015, Dr. Rozier

Adapted from slides by WHS at UIUC


Introduction


Introduction to Combinatorial Methods

• One of the simplest validation methods utilizing analytical/numerical techniques that can be used for reliability and availability modeling.

• Requires certain assumptions…


Combinatorial Assumptions

• Component failures are independent

• For availability, repairs are independent

When these assumptions hold, simple formulas for reliability and availability exist!


Defining Reliability


Reliability

• A key to trustworthy systems is the use of reliable components and systems.
– Leads to high availability!

• Reliability: The reliability of a system at time t, R(t), is the probability that system operation is proper throughout the interval [0,t].


• Probability theory and combinatorics can be applied directly to reliability models.


• Let X be a random variable representing the time to failure of a component. The reliability at time t is given by:

$R(t) = P[X > t]$


• Unreliability can be defined similarly as:

$U(t) = P[X \le t] = 1 - R(t)$


Probability Refresher

• A random variable X is uniquely determined by its set of possible values, $\mathcal{X}$, and the associated probability distribution (or density) function (pdf), a real-valued function $f_X$ defined for each possible value $x \in \mathcal{X}$ as the probability that X has the value x:

$f_X(x) = P[X = x]$


• The cumulative distribution function (cdf) of the discrete random variable X is the real-valued function $F_X$ defined for each $x \in \mathbb{R}$ as

$F_X(x) = P[X \le x] = \sum_{y \le x} f_X(y)$


• The cumulative distribution function (cdf) of the continuous random variable X is the real-valued function $F_X$ defined for each $x \in \mathbb{R}$ as

$F_X(x) = P[X \le x] = \int_{-\infty}^{x} f_X(y) \, dy$


PDFs and CDFs


Reliability and Unreliability

• Reliability: $R(t) = P[X > t] = 1 - F_X(t)$

• Unreliability: $U(t) = P[X \le t] = F_X(t)$
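As a small illustration of the two definitions, the following Python sketch evaluates reliability and unreliability from a cdf. The exponential time-to-failure distribution, the 1000-hour mean, and all function names are assumptions chosen for the example; they are not taken from the lecture.

```python
import math

def exponential_cdf(rate, t):
    """F_X(t) = P[X <= t] for an exponential time to failure (an assumption)."""
    return 1.0 - math.exp(-rate * t) if t >= 0 else 0.0

def unreliability(cdf, t):
    """P[X <= t]: probability the component has failed by time t."""
    return cdf(t)

def reliability(cdf, t):
    """R(t) = P[X > t] = 1 - F_X(t): proper operation throughout [0, t]."""
    return 1.0 - cdf(t)

# Hypothetical component with a mean time to failure of 1000 hours.
RATE_PER_HOUR = 1.0 / 1000.0

def component_cdf(t):
    return exponential_cdf(RATE_PER_HOUR, t)

r_500 = reliability(component_cdf, 500.0)
u_500 = unreliability(component_cdf, 500.0)
print(f"R(500) = {r_500:.4f}, U(500) = {u_500:.4f}")
```

Passing the cdf as a function keeps `reliability` independent of the distribution, so any other failure distribution can be substituted.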


Failure Rates


Failure Rate

• What is the rate at which a component fails at time t?
– The probability that a component that has not yet failed, fails in the interval $(t, t + \Delta t]$

Note: We are not looking at $P[X \in (t, t + \Delta t]]$

We are seeking $P[X \in (t, t + \Delta t] \mid X > t]$


• $r_X(t)$ is called the failure rate or hazard rate
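One standard route to the failure rate, sketched here under the assumption that X is continuous with density $f_X$ (the symbol $r_X(t)$ for the rate is a notational choice): condition on survival to time t, divide by the interval length, and take the limit.

```latex
\begin{align*}
P[t < X \le t + \Delta t \mid X > t]
  &= \frac{P[t < X \le t + \Delta t]}{P[X > t]}
   = \frac{F_X(t + \Delta t) - F_X(t)}{1 - F_X(t)} \\
r_X(t)
  &= \lim_{\Delta t \to 0} \frac{1}{\Delta t} \cdot
     \frac{F_X(t + \Delta t) - F_X(t)}{1 - F_X(t)}
   = \frac{f_X(t)}{1 - F_X(t)}
   = \frac{f_X(t)}{R(t)}
\end{align*}
```

The conditioning on $X > t$ is what distinguishes the failure rate from the density $f_X(t)$.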


Survival Function

• In addition to the reliability/hazard function we have the survival function



Survival and Hazard

• Hazard (or Failure) function – instantaneous failure rate at some time t.

• Survival function – the probability that the time of failure is later than some time t.
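The two functions can be compared numerically. The Python sketch below computes the hazard as pdf / (1 - cdf) and the survival function as 1 - cdf; the Weibull distribution, its shape values, and the helper names are assumptions picked only because they give decreasing, constant, and increasing failure rates.

```python
import math

def hazard(pdf, cdf, t):
    """Instantaneous failure rate at time t: f(t) / (1 - F(t))."""
    return pdf(t) / (1.0 - cdf(t))

def survival(cdf, t):
    """Probability that the time of failure is later than t."""
    return 1.0 - cdf(t)

# Assumed Weibull time to failure (illustrative choice, not from the slides).
def weibull_pdf(t, shape, scale=1.0):
    z = t / scale
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))

def weibull_cdf(t, shape, scale=1.0):
    return 1.0 - math.exp(-((t / scale) ** shape))

def weibull_hazards(shape, times):
    """Hazard of a unit-scale Weibull evaluated at each time in `times`."""
    return [
        hazard(lambda t: weibull_pdf(t, shape), lambda t: weibull_cdf(t, shape), t)
        for t in times
    ]

times = [0.5, 1.0, 2.0]
decreasing = weibull_hazards(0.5, times)  # shape < 1: failure rate falls over time
constant = weibull_hazards(1.0, times)    # shape = 1: exponential, constant rate
increasing = weibull_hazards(2.0, times)  # shape > 1: failure rate rises over time
```

With shape 1 the Weibull reduces to the exponential distribution, which is the only continuous distribution with a constant hazard.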


Typical Failure Rate


System Reliability


• While $F_X(t)$ can give the reliability of a component, how do you compute the reliability of a system?


System failure can occur when one, all, or some of the components fail. If one makes the independent failure assumption, system failures can be computed quite simply.

The independent failure assumption states that all component failures of a system are independent, i.e., the failure of one component does not cause another component to be more or less likely to fail.


• Given this assumption, we can determine:
– Minimum failure time of a set of components
– Maximum failure time of a set of components
– Probability that k of N components have failed at a particular time t.


Maximum of n Independent Failure Times

• Let $X_1, \ldots, X_n$ be independent component failure times. Suppose the system fails at time S if all the components fail.

• Thus, $S = \max\{X_1, \ldots, X_n\}$

• What is $F_S(t)$?

$F_S(t) = P[S \le t]$

$= P[X_1 \le t \text{ AND } X_2 \le t \text{ AND } \ldots \text{ AND } X_n \le t]$

$= P[X_1 \le t] \, P[X_2 \le t] \cdots P[X_n \le t]$ (by independence)

$= F_{X_1}(t) \, F_{X_2}(t) \cdots F_{X_n}(t)$ (by definition)

$= \prod_{i=1}^{n} F_{X_i}(t)$
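For a system that fails only when every component has failed, the system failure distribution is the product of the component cdfs. A minimal Python sketch of that product, assuming exponentially distributed failure times with made-up rates (the function names are hypothetical):

```python
import math

def exponential_cdf(rate):
    """Return F(t) = 1 - exp(-rate * t); the exponential form is an assumption."""
    return lambda t: 1.0 - math.exp(-rate * t)

def max_failure_cdf(component_cdfs, t):
    """F_S(t) for S = max(X_1, ..., X_n) with independent X_i."""
    return math.prod(cdf(t) for cdf in component_cdfs)

# Two hypothetical redundant components with different failure rates.
cdfs = [exponential_cdf(0.001), exponential_cdf(0.002)]
f_parallel = max_failure_cdf(cdfs, 100.0)
print(f"P[both failed by t=100] = {f_parallel:.6f}")
```

Because every factor is at most 1, adding a redundant component can only lower the probability that the whole set has failed by time t.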

Minimum of n Independent Failure Times

• Let $X_1, \ldots, X_n$ be independent component failure times. A system fails at time S if any of the components fail.

• Thus, $S = \min\{X_1, \ldots, X_n\}$

• What is $F_S(t)$?


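• Trick: If $A$ is an event, and $\bar{A}$ is the set complement such that $A \cap \bar{A} = \emptyset$ and $A \cup \bar{A} = \Omega$ (the whole sample space), then $P[A] = 1 - P[\bar{A}]$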

$F_S(t) = P[S \le t]$

$= P[X_1 \le t \text{ OR } X_2 \le t \text{ OR } \ldots \text{ OR } X_n \le t]$

$= 1 - P[X_1 > t \text{ AND } X_2 > t \text{ AND } \ldots \text{ AND } X_n > t]$ (by trick)

$= 1 - P[X_1 > t] \, P[X_2 > t] \cdots P[X_n > t]$ (by independence)

$= 1 - (1 - P[X_1 \le t]) (1 - P[X_2 \le t]) \cdots (1 - P[X_n \le t])$ (by the law of total probability, LOTP)

$= 1 - \prod_{i=1}^{n} (1 - F_{X_i}(t))$
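For a system that fails as soon as any component fails, the system failure distribution is one minus the product of the component survival probabilities. A minimal Python sketch, again assuming exponential failure times with made-up rates and hypothetical function names:

```python
import math

def exponential_cdf(rate):
    """Return F(t) = 1 - exp(-rate * t); the exponential form is an assumption."""
    return lambda t: 1.0 - math.exp(-rate * t)

def min_failure_cdf(component_cdfs, t):
    """F_S(t) for S = min(X_1, ..., X_n) with independent X_i."""
    return 1.0 - math.prod(1.0 - cdf(t) for cdf in component_cdfs)

# Two hypothetical components that must both work for the system to work.
cdfs = [exponential_cdf(0.001), exponential_cdf(0.002)]
f_series = min_failure_cdf(cdfs, 100.0)
print(f"P[at least one failed by t=100] = {f_series:.6f}")
```

For exponential components the minimum is itself exponential with rate equal to the sum of the component rates, which gives a convenient sanity check on the result.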

k of N

• Let $X_1, \ldots, X_N$ be component failure times that have identical distributions (i.e., $F_{X_1}(t) = F_{X_2}(t) = \cdots = F_{X_N}(t)$). The system has failed by time S if k or more of the N components have failed by S.

P[at least k components failed by time t]
= P[exactly k failed OR exactly k+1 failed OR … OR exactly N failed]
= P[exactly k failed] + P[exactly k+1 failed] + … + P[exactly N failed]


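• What is P[exactly k failed]?

= P[k failed and (N – k) have not]

$= \binom{N}{k} F_X(t)^k \, (1 - F_X(t))^{N-k}$

where $F_X(t)$ is the failure distribution of each component. Summing over exactly k, k+1, …, N failures gives

$F_S(t) = \sum_{i=k}^{N} \binom{N}{i} F_X(t)^i \, (1 - F_X(t))^{N-i}$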
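With identical, independent components the number of failed components at time t is binomial, so the probability of at least k failures is a short sum. A minimal Python sketch, where `p` stands for the common failure probability $F_X(t)$ at the time of interest and the function name is hypothetical:

```python
import math

def k_of_n_failure_prob(k, n, p):
    """P[at least k of n independent, identical components have failed].

    p is the per-component failure probability F_X(t) at the chosen time t.
    """
    return sum(
        math.comb(n, i) * p**i * (1.0 - p) ** (n - i)
        for i in range(k, n + 1)
    )

# Illustrative values: 3 components, system fails when 2 or more have failed.
p_fail = k_of_n_failure_prob(2, 3, 0.1)
print(f"P[at least 2 of 3 failed] = {p_fail:.6f}")
```

Setting k = n recovers the all-components-fail case, and k = 1 recovers the any-component-fails case.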


k of N in General

• For non-identical failure distributions, we must sum over all combinations of at least k failures.

• Let $G_k$ be the set of all subsets of $\{X_1, \ldots, X_N\}$ such that each element in $G_k$ is a set of size at least k, i.e.,

$G_k = \{ g \subseteq \{X_1, \ldots, X_N\} : |g| \ge k \}$

• The set $G_k$ represents all the possible failure scenarios.

• Now $F_S(t)$ is given by

$F_S(t) = \sum_{g \in G_k} \left( \prod_{X \in g} F_X(t) \right) \left( \prod_{X \notin g} (1 - F_X(t)) \right)$
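The sum over failure scenarios can be written directly as an enumeration of subsets. The Python sketch below takes a list of per-component failure probabilities at a fixed time t (illustrative values, hypothetical function name) and adds up, for every subset of at least k components, the probability that exactly that subset has failed.

```python
from itertools import combinations

def k_of_n_general(k, fail_probs):
    """P[at least k components failed] for independent, non-identical components.

    fail_probs[i] is F_{X_i}(t) for component i at the chosen time t.
    """
    n = len(fail_probs)
    total = 0.0
    for size in range(k, n + 1):
        for failed in combinations(range(n), size):
            failed_set = set(failed)
            term = 1.0
            for i, p in enumerate(fail_probs):
                # Failed components contribute p, surviving ones contribute 1 - p.
                term *= p if i in failed_set else (1.0 - p)
            total += term
    return total

p_general = k_of_n_general(2, [0.1, 0.2, 0.3])
print(f"P[at least 2 of 3 failed] = {p_general:.6f}")
```

The enumeration visits a number of subsets that grows exponentially in N, so this direct form is only practical for small systems.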


Component Building Blocks

• Complex systems can be analyzed hierarchically.

Example: A computer fails if both power supplies fail, or both memories fail, or if the CPU fails.

System problem is one of a minimum: the system fails when the first of three subsystems fails…


• Power supply subsystem is a maximum: both must fail

• Memory subsystem is a maximum: both must fail
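The hierarchical analysis of the computer example can be sketched by composing two small building blocks: a maximum for each redundant pair and a minimum across the three subsystems. The failure probabilities below are invented values at some fixed time t, and the function names are hypothetical.

```python
import math

def parallel(*fail_probs):
    """Subsystem fails only if all of its components fail (a maximum)."""
    return math.prod(fail_probs)

def series(*fail_probs):
    """System fails as soon as any one part fails (a minimum)."""
    return 1.0 - math.prod(1.0 - p for p in fail_probs)

# Hypothetical per-component failure probabilities at a fixed time t.
P_POWER, P_MEMORY, P_CPU = 0.1, 0.2, 0.05

power = parallel(P_POWER, P_POWER)      # both power supplies must fail
memory = parallel(P_MEMORY, P_MEMORY)   # both memories must fail
computer = series(power, memory, P_CPU) # first subsystem failure fails the system
print(f"P[computer failed] = {computer:.5f}")
```

Each subsystem result is itself a failure probability, which is why it can be fed to the next level of the hierarchy without any special handling.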


Summary

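A system comprises N components, where the component failure times are given by the random variables $X_1, \ldots, X_N$. The system fails at time S with distribution $F_S(t)$ if:

| Condition | Distribution |
| --- | --- |
| All components fail | $F_S(t) = \prod_{i=1}^{N} F_{X_i}(t)$ |
| One component fails | $F_S(t) = 1 - \prod_{i=1}^{N} (1 - F_{X_i}(t))$ |
| k components fail, identical distributions | $F_S(t) = \sum_{i=k}^{N} \binom{N}{i} F_X(t)^i (1 - F_X(t))^{N-i}$ |
| k components fail, general case | $F_S(t) = \sum_{g \in G_k} \left( \prod_{X \in g} F_X(t) \right) \left( \prod_{X \notin g} (1 - F_X(t)) \right)$ |

Here $G_k$ is the set of all subsets of $\{X_1, \ldots, X_N\}$ of size at least k.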


Reliability Formalisms

• There are several popular graphical formalisms to express system reliability. At the core of the solvers for these formalisms are the methods we have just examined. We will discuss a subset of these formalisms:
– Reliability Block Diagrams
– Fault Trees
– Reliability Graphs

There is nothing special about these formalisms except their popularity.


What is a Graphical Formalism?

• A way to draw visual diagrams with formal underlying mathematical meanings.


Reliability Block Diagrams

• Blocks represent components.
• A system failure occurs if there is no path from source to sink.

• Series:
– System fails if any component fails.

• Parallel:
– System fails if all components fail.

• k of N:
– System fails if at least k of N components fail.


Example

A NASA satellite architecture under study is designed for high reliability. The major computer system components include the CPU system, the high-speed network for data collection and transmission, and the low-speed network for engineering and control. The satellite fails if any of the major systems fail.

There are 3 computers, and the computer system fails if 2 or more of the computers fail. Failure distribution of a computer is given by

There is a redundant (2) high-speed network, and the high-speed network system fails if both networks fail. The distribution of a high-speed network failure is given by

The low-speed network is arranged similarly, with a failure distribution of
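The structure of this example can be sketched in Python by combining a 2-of-3 block for the computers, a redundant pair for each network, and a series combination of the three subsystems. The exponential form and the rates below are placeholders invented for illustration, not the distributions from the slide, and the low-speed network is assumed to be a redundant pair like the high-speed one. All function names are hypothetical.

```python
import math

def k_of_n(k, n, p):
    """P[at least k of n independent, identical components have failed]."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))

def satellite_failure_prob(p_computer, p_high, p_low):
    """Failure probability of the satellite given per-component failure probabilities."""
    computers = k_of_n(2, 3, p_computer)  # fails if 2 or more of 3 computers fail
    high_speed = p_high**2                # fails if both high-speed networks fail
    low_speed = p_low**2                  # assumed: redundant pair, both must fail
    # Satellite fails if any major subsystem fails (series combination).
    return 1.0 - (1.0 - computers) * (1.0 - high_speed) * (1.0 - low_speed)

def exponential_cdf(rate, t):
    """Placeholder failure distribution; the exponential form is an assumption."""
    return 1.0 - math.exp(-rate * t)

# Hypothetical failure rates per hour, chosen only to make the sketch runnable.
RATES = {"computer": 1e-4, "high_speed": 2e-4, "low_speed": 5e-5}

def satellite_cdf(t):
    """F_S(t) for the satellite under the placeholder exponential rates."""
    probs = (exponential_cdf(RATES[name], t) for name in ("computer", "high_speed", "low_speed"))
    return satellite_failure_prob(*probs)

print(f"P[satellite failed by t=5000 h] = {satellite_cdf(5000.0):.6f}")
```

Keeping `satellite_failure_prob` separate from the choice of distribution means the actual component distributions can be plugged in without changing the structural model.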


Background: Series-Parallel Graphs


Series-Parallel Decomposition of NASA Example


Fault Trees


Fault Tree Example


Reliability Graphs


Reliability Graph Example


Solve by Conditioning



Conditioning Fault Trees


Reliability/Availability Point Estimates


Reliability/Availability Tables


Reliability Modeling Process




For next time

• Homework 1!
– Due next Tuesday

• Review combinatorics