
Software Reliability[Kehandalan Perangkat Lunak]

Catur Iswahyudi

Agenda

• What is Software Reliability?
• Software Failure Mechanisms
  – Hardware vs Software
• Measuring Software Reliability
• Software Reliability Models
• Statistical Testing
• Conclusion

Functional and Non-functional Requirements

• System functional requirements may specify error checking, recovery features, and system failure protection

• System reliability and availability are specified as part of the non-functional requirements for the system.

System Reliability Specification

• Hardware reliability
  – probability that a hardware component fails
• Software reliability
  – probability that a software component will produce an incorrect output
  – software does not wear out
  – software can continue to operate after producing a bad result
• Operator reliability
  – probability that the system user makes an error

Failure Probabilities

• If there are two independent components in a system and the operation of the system depends on them both, then the probability of system failure is
  P(S) = P(A) + P(B) (approximately, for small failure probabilities; exactly P(A) + P(B) − P(A)·P(B))
• If the component is replicated n times, then the probability of failure is
  P(S) = P(A)^n
  meaning that the system fails only when all replicas fail at once.
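These formulas can be checked numerically; a minimal sketch (the component probabilities are invented examples):

```python
# Failure probabilities for the two configurations above (illustrative values).

def series_failure(p_a: float, p_b: float) -> float:
    """System needs both independent components A and B, so it fails
    if either one fails."""
    return p_a + p_b - p_a * p_b  # exact; roughly p_a + p_b when both are small

def replicated_failure(p_a: float, n: int) -> float:
    """System of n identical independent replicas fails only when
    all replicas fail at once."""
    return p_a ** n

print(series_failure(0.01, 0.02))   # ~0.0298, close to 0.01 + 0.02
print(replicated_failure(0.01, 3))  # ~1e-06
```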

Functional Reliability Requirements

• The system will check that all operator inputs fall within their required ranges.
• The system will check all disks for bad blocks each time it is booted.
• The system must be implemented using a standard implementation of Ada.
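The first requirement (range-checking of operator inputs) might look like this in practice; the input names and ranges here are invented for illustration:

```python
# Hypothetical operator-input range check; names and ranges are invented.

REQUIRED_RANGES = {
    "temperature_setpoint": (0.0, 100.0),   # degrees Celsius
    "pump_speed": (0, 3000),                # rpm
}

def input_in_range(name: str, value: float) -> bool:
    """Return True iff the operator input falls within its required range."""
    low, high = REQUIRED_RANGES[name]
    return low <= value <= high

print(input_in_range("temperature_setpoint", 55.0))  # True
print(input_in_range("pump_speed", 5000))            # False: out of range
```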

Non-functional Reliability Specification

• The required level of reliability must be expressed quantitatively.
• Reliability is a dynamic system attribute.
• Source-code reliability specifications are meaningless (e.g. N faults/1000 LOC).
• An appropriate metric should be chosen to specify the overall system reliability.

Hardware Reliability Metrics

• Hardware metrics are not suitable for software, since they are based on the notion of component failure.
• Software failures are often design failures.
• Often the system is available again soon after the failure has occurred.
• Hardware components can wear out; software does not.

Software Reliability Metrics

• Reliability metrics are units of measure for system reliability

• System reliability is measured by counting the number of operational failures and relating these to demands made on the system at the time of failure

• A long-term measurement program is required to assess the reliability of critical systems

Reliability Metrics - part 1

• Probability of Failure on Demand (POFOD)
  – POFOD = 0.001 means the service fails for one in every 1000 requests.
• Rate of Occurrence of Failures (ROCOF) – failures per unit time
  – ROCOF = 0.02 means two failures in every 100 operational time units.
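Both metrics fall out directly from operational data; a minimal sketch, assuming we log demand counts, failure counts, and operational time:

```python
# POFOD and ROCOF from hypothetical operational data.

def pofod(failed_demands: int, total_demands: int) -> float:
    """Probability of failure on demand: fraction of demands that fail."""
    return failed_demands / total_demands

def rocof(failures: int, operational_time: float) -> float:
    """Rate of occurrence of failures per operational time unit."""
    return failures / operational_time

print(pofod(1, 1000))  # 0.001: one failure per 1000 requests
print(rocof(2, 100))   # 0.02: two failures per 100 time units
```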

Reliability Metrics - part 2

• Mean Time to Failure (MTTF)
  – average time between observed failures (often used interchangeably with MTBF when repair time is negligible)
• Availability = MTBF / (MTBF + MTTR)
  – MTBF = Mean Time Between Failures
  – MTTR = Mean Time To Repair
• Reliability = MTBF / (1 + MTBF)
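A quick numeric check of the availability and reliability formulas above (the MTBF/MTTR figures are invented):

```python
# Availability and reliability from the formulas on this slide.

def availability(mtbf: float, mttr: float) -> float:
    """Fraction of time the system is operational."""
    return mtbf / (mtbf + mttr)

def reliability(mtbf: float) -> float:
    """Reliability as defined above: MTBF / (1 + MTBF)."""
    return mtbf / (1 + mtbf)

# Example: a mean of 99 hours between failures, 1 hour to repair.
print(availability(99.0, 1.0))  # 0.99
print(reliability(99.0))        # 0.99
```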


What is Reliability?

• Reliability is a broad concept.
  – It is applied whenever we expect something to behave in a certain way.
• Reliability is one of the metrics used to measure quality.
• It is a user-oriented quality factor relating to system operation.
  – Intuitively, if the users of a system rarely experience failure, the system is considered more reliable than one that fails more often.
• A system without faults is considered highly reliable.
  – Constructing a correct system is a difficult task.
  – Even an incorrect system may be considered reliable if the frequency of failure is "acceptable."
• Key concepts in discussing reliability:
  – Fault
  – Failure
  – Time
  – Three kinds of time intervals: MTTR, MTTF, MTBF


What is Reliability?

• Failure
  – A failure is said to occur if the observable outcome of a program execution differs from the expected outcome.
• Fault
  – The adjudged cause of a failure is called a fault.
  – Example: a failure may be caused by a defective block of code.
• Time
  – Time is a key concept in the formulation of reliability. If the time gap between two successive failures is short, we say that the system is less reliable.
  – Two forms of time are considered:
    • Execution time (τ)
    • Calendar time (t)


What is Reliability?

• MTTF: Mean Time To Failure• MTTR: Mean Time To Repair• MTBF: Mean Time Between Failures (= MTTF + MTTR)

Relationship between MTTR, MTTF, and MTBF.

What is Software Reliability

• the probability of failure-free software operation for a specified period of time in a specified environment

• Probability of the product working “correctly” over a given period of time.

• Informally denotes a product’s trustworthiness or dependability.

What is Software Reliability

• Software Reliability is an important attribute of software quality, together with – functionality, – usability, – performance, – serviceability, – capability, – installability, – maintainability, – documentation.

What is Software Reliability

• Software Reliability Modeling
• Prediction Analysis
• Reliability Measurement
• Defect Classification
• Trend Analysis
• Field Data Analysis
• Software Metrics
• Software Testing and Reliability
• Fault-Tolerance
• Fault Trees
• Software Reliability Simulation
• Software Reliability Tools

From Handbook of Software Reliability Engineering, edited by Michael R. Lyu, 1996.

What is Software Reliability

• Software Reliability is hard to achieve, because the complexity of software tends to be high.

• While the complexity of software is inversely related to software reliability, it is directly related to other important factors in software quality, especially functionality and capability.


What is Reliability?

• Two ways to measure reliability– Counting failures in periodic intervals

• Observer the trend of cumulative failure count - µ(). – Failure intensity

• Observe the trend of number of failures per unit time – λ().• µ()

– This denotes the total number of failures observed until execution time from the beginning of system execution.

• λ()– This denotes the number of failures observed per unit time after time

units of executing the system from the beginning. This is also called the failure intensity at time .

• Relationship between λ() and µ()– λ() = dµ()/d
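Both measures can be estimated from a failure log; a minimal sketch with made-up failure times, where a finite window approximates the derivative dµ(τ)/dτ:

```python
import bisect

# Estimating mu(tau) and the failure intensity from a failure log.
failure_times = [3, 33, 146, 227, 342, 351, 353, 444, 556, 571]  # made up

def mu(tau: float) -> int:
    """Cumulative number of failures observed up to execution time tau."""
    return bisect.bisect_right(failure_times, tau)

def intensity(tau: float, window: float = 100.0) -> float:
    """Failures per unit time near tau, approximating d(mu)/d(tau)."""
    return (mu(tau) - mu(tau - window)) / window

print(mu(360))         # 7 failures observed by time 360
print(intensity(360))  # 0.03: three failures in (260, 360]
```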


Definitions of Software Reliability

• First definition
  – Software reliability is defined as the probability of failure-free operation of a software system for a specified time in a specified environment.
• Key elements of the above definition
  – Probability of failure-free operation
  – Length of time of failure-free operation
  – A given execution environment
• Example
  – The probability that a PC in a store is up and running for eight hours without a crash is 0.99.
• Second definition
  – Failure intensity is a measure of the reliability of a software system operating in a given environment.
  – Example: an air traffic control system fails once in two years.
• Comparing the two
  – The first puts the emphasis on MTTF, whereas the second puts it on the failure count.


Factors Influencing Software Reliability

• A user's perception of the reliability of software depends upon two categories of information:
  – the number of faults present in the software;
  – the way users operate the system, known as the operational profile.
• The fault count in a system is influenced by the following:
  – size and complexity of the code
  – characteristics of the development process used
  – education, experience, and training of the development personnel
  – operational environment


Applications of Software Reliability

• Comparison of software engineering technologies
  – What is the cost of adopting a technology?
  – What is the return from the technology, in terms of cost and quality?
• Measuring the progress of system testing
  – Key question: how much testing has been done?
  – The failure intensity measure tells us about the present quality of the system: high intensity means more tests need to be performed.
• Controlling the system in operation
  – The amount of change made to software during maintenance affects its reliability. Thus the amount of change to be made in one go is determined by how much reliability we are prepared to potentially lose.
• Better insight into software development processes
  – Quantification of quality gives us a better insight into the development processes.

Software Failure Mechanisms

• Failure cause: software defects are mainly design defects.
• Wear-out: software does not have an energy-related wear-out phase; errors can occur without warning.
• Repairable system concept: periodic restarts can help fix software problems.
• Time dependency and life cycle: software reliability is not a function of operational time.
• Environmental factors: do not affect software reliability, except insofar as they might affect program inputs.
• Reliability prediction: software reliability cannot be predicted from any physical basis, since it depends completely on human factors in design.

Software Failure Mechanisms

• Redundancy: cannot improve software reliability if identical software components are used.
• Interfaces: software interfaces are purely conceptual rather than visual.
• Failure rate motivators: usually not predictable from analyses of separate statements.
• Built with standard components: well-understood and extensively tested standard parts help improve maintainability and reliability, but this trend has not been observed in the software industry. Code reuse has been around for some time, but only to a very limited extent. Strictly speaking, there are no standard parts for software, except some standardized logic structures.


Measuring Software Reliability

Don't define what you won't collect.

Don't collect what you won't analyse.

Don't analyse what you won't use.

Measuring Software Reliability

• Measuring software reliability remains a difficult problem, because we don't have a good understanding of the nature of software.
• Even the most obvious product metrics, such as software size, do not have a uniform definition.

Measuring Software Reliability

• Current practices of software reliability measurement can be divided into four categories: – Product metrics – Project management metrics – Process metrics – Fault and failure metrics

Measuring Software Reliability

• Different categories of software products have different reliability requirements:
  – the level of reliability required for a software product should be specified in the SRS document.
• A good reliability measure should be observer-independent,
  – so that different people can agree on the reliability.

Measuring Software Reliability

• LOC, KLOC, SLOC, KSLOC
• McCabe's complexity metric
• Test coverage metrics
• ISO 9000 quality management standards
• MTBF
  – e.g. once a failure occurs, the next failure is expected after 100 hours of clock time (not running time).

Measuring Software Reliability

• Failure classes
  – Transient: transient failures occur only for certain inputs.
  – Permanent: permanent failures occur for all input values.
  – Recoverable: when recoverable failures occur, the system recovers with or without operator intervention.
  – Unrecoverable: the system may have to be restarted.
  – Cosmetic: may cause minor irritations; do not lead to incorrect results.
    • E.g. a mouse button has to be clicked twice instead of once to invoke a GUI function.

Measuring Software Reliability

• Errors do not cause failures at the same frequency and severity.
  – Measuring latent errors alone is not enough.
• The failure rate is observer-dependent.
• No simple relationship has been observed between system reliability and the number of latent software defects.
• Removing errors from parts of the software that are rarely used makes little difference to the perceived reliability.
  – Removing 60% of the defects from the least-used parts would lead to only about a 3% improvement in product reliability.
• The reliability improvement from correcting a single error depends on whether the error belongs to the core or the non-core part of the program.
• The perceived reliability depends to a large extent upon how the product is used; in technical terms, on its operational profile.

Software Reliability Models

• Software reliability models have emerged as people try to understand the characteristics of how and why software fails, and try to quantify software reliability

• Over 200 models have been developed since the early 1970s, but how to quantify software reliability still remains largely unsolved

• There is no single model that can be used in all situations. No model is complete or even representative.

Software Reliability Models

• Most software reliability models contain the following parts:
  – assumptions,
  – factors,
  – a mathematical function that relates reliability to the factors; it is usually a higher-order exponential or logarithmic function.

Software Reliability Models

• Software modeling techniques can be divided into two subcategories: – prediction modeling– estimation modeling.

• Both kinds of modeling techniques are based on observing and accumulating failure data and analyzing it with statistical inference.

Boğaziçi University, Computer Engineering – Software Reliability Modelling

Software Reliability Models

• Data reference
  – Prediction models: use historical data.
  – Estimation models: use data from the current software development effort.
• When used in the development cycle
  – Prediction models: usually made prior to the development or test phases; can be used as early as the concept phase.
  – Estimation models: usually made later in the life cycle, after some data have been collected; not typically used in the concept or development phases.
• Time frame
  – Prediction models: predict reliability at some future time.
  – Estimation models: estimate reliability at either the present or some future time.

Software Reliability Models

• There are two main types of uncertainty which render any reliability measurement inaccurate:
• Type 1 uncertainty:
  – our lack of knowledge about how the system will be used, i.e. its operational profile.
• Type 2 uncertainty:
  – reflects our lack of knowledge about the effect of fault removal.
  – When we fix a fault, we are not sure whether the corrections are complete and successful, and whether no other faults have been introduced.
  – Even if the faults are fixed properly, we do not know how much the interfailure time will improve.

Software Reliability Models

• Step Function Model
  – The simplest reliability growth model: a step function model.
  – The basic assumption: reliability increases by a constant amount each time an error is detected and repaired.
  – Assumes all errors contribute equally to reliability growth.
  – This is highly unrealistic: we already know that different errors contribute differently to reliability growth.

Software Reliability Models

• Jelinski and Moranda Model
  – Recognizes that each time an error is repaired, reliability does not increase by a constant amount.
  – The reliability improvement due to fixing an error is assumed to be proportional to the number of errors present in the system at that time.
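Under this assumption the failure rate before the i-th failure is proportional to the faults still remaining, λ_i = φ(N − i + 1); a sketch with assumed parameter values:

```python
# Jelinski-Moranda sketch: the failure rate is proportional to the number
# of faults remaining. N and PHI are assumed, illustrative parameters.

N = 10      # assumed initial number of faults
PHI = 0.25  # assumed per-fault contribution to the failure rate

def hazard(i: int) -> float:
    """Failure rate before the i-th failure (i = 1..N)."""
    return PHI * (N - i + 1)

def expected_interfailure_time(i: int) -> float:
    """Mean time between failures i-1 and i; grows as faults are removed."""
    return 1.0 / hazard(i)

print(hazard(1))                       # 2.5: all 10 faults still present
print(expected_interfailure_time(10))  # 4.0: only one fault left
```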

Software Reliability Models

• Littlewood and Verrall's Model
  – Assumes different faults have different sizes, thereby contributing unequally to failures.
  – Allows for negative reliability growth.
  – Large faults tend to be detected and fixed earlier.
  – As the number of errors is driven down by the progress of testing, so is the average error size, causing a law of diminishing returns in debugging.

Software Reliability Models

• Variations exist:
  – LNHPP (Littlewood non-homogeneous Poisson process) model
• Goel–Okumoto (G-O) imperfect debugging model
  – GONHPP
• Musa–Okumoto (M-O) logarithmic Poisson execution time model

Software Reliability Models

• Applicability of models:
  – There is no universally applicable reliability growth model.
  – Reliability growth is not independent of the application.
  – Fit the observed data to several growth models and take the one that best fits the data.


Software Reliability Models

• Observed failure intensity can be computed in a straightforward manner from the tables of failure time or grouped data (e.g. Musa et al. 1987).

Software Reliability Models

• Example (136 failures in total):
  – Failure times (CPU seconds): 3, 33, 146, 227, 342, 351, 353, 444, 556, 571, 709, 759, 836, ..., 88682.
  – Data are grouped into sets of 5, and the observed intensity, cumulative failure distribution, and mean failure times are computed, tabulated, and plotted.
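The grouping step can be sketched as follows, using the failure times quoted above; each full group of 5 yields one observed-intensity point:

```python
# Group failure times into sets of 5 and compute the observed intensity
# (failures per CPU second over each group's time span).

failure_times = [3, 33, 146, 227, 342, 351, 353, 444, 556, 571,
                 709, 759, 836]  # first entries of the 136-failure data set

def grouped_intensity(times, group_size=5):
    """Observed intensity per full group: group_size / elapsed time."""
    results = []
    prev_end = 0
    for start in range(0, len(times) - group_size + 1, group_size):
        end_time = times[start + group_size - 1]
        results.append(group_size / (end_time - prev_end))
        prev_end = end_time
    return results

print(grouped_intensity(failure_times))  # intensities for the first two groups
```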

Software Reliability Models

• Two common models are the "basic execution time model“ and the "logarithmic Poisson execution time model" (e.g. Musa et al. 1987).

Software Reliability Models

• Basic Execution Time Model
  – Failure intensity λ(τ) as a function of execution (debugging) time τ:
    λ(τ) = λ0 exp(−λ0 τ / ν0)
  – where λ0 is the initial intensity and ν0 is the total expected number of failures (faults).

Software Reliability Models

• The corresponding mean value function is
  μ(τ) = ν0 (1 − exp(−λ0 τ / ν0))
  where μ(τ) is the mean number of failures experienced by time τ.

Software Reliability Models

• In this case the Logarithmic-Poisson Model fits somewhat better than the Basic Execution Time Model.

• In some other projects BE model fits better than LP model.

Software Reliability Models

• Additional expected number of failures, Δμ, that must be experienced to reach a failure intensity objective:
  Δμ = (ν0 / λ0) (λP − λF)
• where λP is the present failure intensity and λF is the failure intensity objective. The additional execution time, Δτ, required to reach the failure intensity objective is
  Δτ = (ν0 / λ0) ln(λP / λF)
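For the basic execution time model, both quantities follow directly from the initial intensity λ0 and the expected total failures ν0; a sketch with invented parameter values:

```python
import math

# Additional failures and execution time needed to move from the present
# failure intensity lam_p to the objective lam_f (basic execution time
# model). lam0 and nu0 below are invented, illustrative parameters.

def delta_mu(lam_p: float, lam_f: float, lam0: float, nu0: float) -> float:
    """Additional expected failures: (nu0 / lam0) * (lam_p - lam_f)."""
    return (nu0 / lam0) * (lam_p - lam_f)

def delta_tau(lam_p: float, lam_f: float, lam0: float, nu0: float) -> float:
    """Additional execution time: (nu0 / lam0) * ln(lam_p / lam_f)."""
    return (nu0 / lam0) * math.log(lam_p / lam_f)

# lam0 = 10 failures/CPU-hr, nu0 = 100 faults, intensity 2.0 -> 0.5.
print(delta_mu(2.0, 0.5, 10.0, 100.0))   # 15.0 more failures expected
print(delta_tau(2.0, 0.5, 10.0, 100.0))  # ~13.86 CPU-hr more testing
```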

Software Reliability Models

• After fitting a model describing the failure process, we can estimate its parameters and quantities such as the total number of faults in the code, the future failure intensity, and the additional time required to achieve a failure intensity objective.

Statistical Testing

• The objective is to determine reliability rather than discover errors.

• Uses data different from defect testing.

Statistical Testing

• Different users have different operational profiles:
  – i.e. they use the system in different ways;
  – formally, an operational profile is a probability distribution over inputs.
• Divide the input data into a number of input classes:
  – e.g. create, edit, print, file operations, etc.
• Assign a probability value to each input class:
  – the probability that an input value from that class is selected.
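Selecting test inputs according to an operational profile is weighted random sampling over the input classes; a sketch with invented classes and probabilities:

```python
import random

# Hypothetical operational profile: input classes and their probabilities.
OPERATIONAL_PROFILE = {
    "create": 0.10,
    "edit": 0.55,
    "print": 0.05,
    "file_ops": 0.30,
}

def generate_test_inputs(n: int, seed: int = 42) -> list:
    """Draw n input classes according to the operational profile."""
    rng = random.Random(seed)
    classes = list(OPERATIONAL_PROFILE)
    weights = list(OPERATIONAL_PROFILE.values())
    return rng.choices(classes, weights=weights, k=n)

sample = generate_test_inputs(1000)
print(sample.count("edit") / len(sample))  # roughly 0.55, as specified
```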

Statistical Testing

• Determine the operational profile of the software:
  – this can be determined by analyzing the usage pattern.
• Manually select or automatically generate a set of test data:
  – corresponding to the operational profile.
• Apply the test cases to the program:
  – record the execution time between each failure;
  – it may not be appropriate to use raw execution time.
• After a statistically significant number of failures have been observed:
  – reliability can be computed.
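Once the failures have been recorded, simple point estimates follow from the interfailure times; a minimal sketch with made-up data:

```python
# Estimate MTTF and failure intensity from recorded interfailure times.

interfailure_times = [12, 45, 30, 80, 61, 95, 110, 140]  # made-up data

def mttf(times) -> float:
    """Mean time to failure: average observed interfailure time."""
    return sum(times) / len(times)

def failure_intensity(times) -> float:
    """Observed failures per unit of execution time."""
    return len(times) / sum(times)

print(mttf(interfailure_times))               # 71.625 time units
print(failure_intensity(interfailure_times))  # ~0.014 failures per unit
```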

Statistical Testing

• Relies on using a large test data set.
• Assumes that only a small percentage of test inputs is likely to cause system failure.
• It is straightforward to generate tests corresponding to the most common inputs, but a statistically significant percentage of unlikely inputs should also be included.
• Creating these may be difficult, especially if test generators are used.

Conclusions

• Software reliability is a key part of software quality.
• Software reliability improvement is hard.
• There are no generic models.
• Measurement is very important for finding the correct model.
• Statistical testing should be used, but again it is not easy.
• Software reliability modelling is not as simple as described here.

Thank you.

Q&A
