Reliability and Quality: Predicting Post-release Defects Using Pre-release Field Testing Results


DESCRIPTION

Paper: Predicting Post-release Defects Using Pre-release Field Testing Results. Authors: Foutse Khomh, Brian Chan, Ying Zou, Anand Sinha, and Dave Dietz. Session: Research Track Session 9: Reliability and Quality.

TRANSCRIPT

PREDICTING POST-RELEASE DEFECTS USING PRE-RELEASE FIELD TESTING RESULTS

Foutse Khomh, Brian Chan, Ying Zou
Anand Sinha, Dave Dietz

FIELD TESTING CYCLE

Field testing is important to improve the quality of an application before release.

MEAN TIME BETWEEN FAILURES (MTBF)

Mean Time Between Failures (MTBF) is frequently used to gauge the reliability of an application. Applications with a low MTBF are undesirable, since they tend to have a higher number of defects.
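As a minimal sketch (not the authors' implementation), MTBF can be computed from per-user usage time and failure counts; the `user_logs` data below is hypothetical:

```python
from statistics import mean

def mtbf_hours(usage_hours: float, failure_count: int) -> float:
    """Mean Time Between Failures: active usage time divided by
    the number of failures observed during that time."""
    if failure_count == 0:
        return float("inf")  # no failures observed in the field test
    return usage_hours / failure_count

# Hypothetical field-testing log: (total usage hours, failures) per user.
user_logs = [(120.0, 3), (80.0, 1), (200.0, 0), (45.0, 2)]

per_user = [mtbf_hours(h, f) for h, f in user_logs if f > 0]
print(f"Average MTBF over users with failures: {mean(per_user):.1f} h")
```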

AVERAGE USAGE TIME (AVT)

• AVT is the average time that a user actively uses the application.
• The AVT can be longer than the period of field testing.

A longer AVT indicates that an application is reliable and that users tend to use it longer.
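Correspondingly, a sketch of AVT as a plain average of per-user active-usage time (the `usage_hours` values are hypothetical):

```python
from statistics import mean

# Hypothetical active-usage durations (hours) per field-testing user.
usage_hours = [14.5, 22.0, 8.75, 30.0, 17.25]

avt = mean(usage_hours)  # Average Usage Time across all testers
print(f"AVT: {avt:.2f} h")
```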

PROBLEM STATEMENT

• MTBF and AVT cannot capture the whole pattern of failure occurrences in the field testing of an application.

The reliability of versions A and B (the running example in the following slides) is very different.

METRICS

We propose three metrics that capture additional patterns of failure occurrences:

• TTFF: the average length of usage time before the occurrence of the first failure,
• FAR: the failure accumulation rating, which gauges the spread of failures to the majority of users, and
• OFR: the overall failure ratio, which captures daily rates of failures.

AVERAGE TIME TO FIRST FAILURE (TTFF)

[Figure: percentage of users reporting their first failure on each day (days 1-14) of field testing, for Version A and Version B.]

TTFF produces high scores for applications where the majority of users experience the first failure late. In the running example, TTFF_A = 6.11 and TTFF_B = 3.56.
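A minimal sketch of a TTFF-style computation, assuming per-user logs record the day of the first failure (the paper's exact formula may differ; `first_failure_day` is a hypothetical structure):

```python
from statistics import mean

# Hypothetical logs: 1-based day of field testing on which each user's
# first failure occurred; None if the user reported no failure.
first_failure_day = {"u1": 3, "u2": 7, "u3": None, "u4": 10, "u5": 5}

# One simple reading of TTFF: average usage time (here, in days) before
# the first failure, over the users who reported at least one failure.
days = [d for d in first_failure_day.values() if d is not None]
print(f"TTFF: {mean(days):.2f} days")  # higher = first failures come later
```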

FAILURE ACCUMULATION RATING (FAR)

[Figure: percentage of users reporting each number of unique failures (1-14), for Version A and Version B.]

The FAR metric produces high scores for applications where the majority of users report a very low number of failures. In the running example, FAR_A = 6.97 and FAR_B = 4.97.
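The deck does not reproduce the FAR formula itself, so the sketch below only builds the distribution underlying the plot: how many unique failures each user accumulates, and what fraction of users falls at each count (all data hypothetical):

```python
from collections import Counter

# Hypothetical logs: the set of unique failure IDs each user observed.
failures_per_user = {
    "u1": {"F1"}, "u2": {"F1", "F3"}, "u3": set(),
    "u4": {"F2"}, "u5": {"F1", "F2", "F4"},
}

# For each failure count k, the fraction of users who accumulated
# exactly k unique failures -- the x/y pairs of the FAR plot.
counts = Counter(len(ids) for ids in failures_per_user.values())
n_users = len(failures_per_user)
for k in sorted(counts):
    print(f"{k} unique failure(s): {counts[k] / n_users:.0%} of users")
```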

OVERALL FAILURE RATING (OFR)

[Figure: percentage of users reporting failures on each day (days 1-14) of field testing, for Version A and Version B.]

The OFR metric produces high scores for applications with fewer users reporting failures overall. In the running example, OFR_A = 0.93 and OFR_B = 0.78.
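Again only as an assumption-laden sketch (the formula itself is not in the deck): one way to turn daily failure rates into a score where fewer reporting users yields a higher rating:

```python
# Hypothetical daily data: fraction of users reporting a failure on
# each day of a 14-day field test.
daily_fraction_reporting = [
    0.02, 0.05, 0.08, 0.06, 0.04, 0.03, 0.05,
    0.07, 0.04, 0.03, 0.02, 0.04, 0.03, 0.02,
]

# One simple rating: 1 minus the mean daily failure-reporting rate,
# so fewer users reporting failures pushes the score toward 1.
ofr = 1.0 - sum(daily_fraction_reporting) / len(daily_fraction_reporting)
print(f"OFR: {ofr:.2f}")
```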

CASE STUDY

We analyze 18 versions of an enterprise software application.

• Overall, 2,546 users were involved in the field testing.
• The testing period lasted 30 days.

SPEARMAN CORRELATION OF THE METRICS

       TTFF   FAR    OFR    AVT    MTBF
TTFF   1.00   0.09  -0.08  -0.31  -0.08
FAR    0.09   1.00   0.07   0.33  -0.24
OFR   -0.08   0.07   1.00   0.39  -0.54
AVT   -0.31   0.33   0.39   1.00  -0.30
MTBF  -0.08  -0.24  -0.54  -0.30   1.00
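Such a matrix can be reproduced with SciPy, assuming the per-version metric values are available as columns (random placeholder data below):

```python
import numpy as np
from scipy.stats import spearmanr

# Placeholder data: 18 versions x 5 metrics (TTFF, FAR, OFR, AVT, MTBF).
rng = np.random.default_rng(0)
data = rng.random((18, 5))

rho, _ = spearmanr(data)  # 5x5 Spearman rank-correlation matrix
for name, row in zip(["TTFF", "FAR", "OFR", "AVT", "MTBF"], rho):
    print(name, np.round(row, 2))
```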

INDEPENDENCE AMONG PROPOSED METRICS

[Figure: loadings of TTFF, FAR, OFR, and MTBF on the first four principal components (PC1-PC4); loadings range from -1 to 1.]
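A sketch of the principal component analysis behind such a loading chart, using scikit-learn (a tooling assumption; the deck does not say what was used):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Placeholder data: 18 versions x 4 metrics (TTFF, FAR, OFR, MTBF).
rng = np.random.default_rng(1)
X = rng.random((18, 4))

# Standardize, then inspect how each metric loads on each component;
# metrics that load on different components carry independent signal.
pca = PCA(n_components=4).fit(StandardScaler().fit_transform(X))
for i, component in enumerate(pca.components_, start=1):
    print(f"PC{i} loadings:", np.round(component, 2))
```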

PREDICTIVE POWER FOR POST-RELEASE DEFECTS

[Figure: marginal R-square of each metric (TTFF, FAR, OFR, AVT, MTBF) in predicting post-release defects over 6-month, 1-year, and 2-year windows.]
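One common way to obtain a marginal R-square is to compare a regression model's fit with and without the metric of interest; the sketch below illustrates that idea on placeholder data (the paper's exact modeling setup is not detailed in the deck):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Placeholder data: 18 versions x 5 metrics, plus defect counts.
rng = np.random.default_rng(2)
names = ["TTFF", "FAR", "OFR", "AVT", "MTBF"]
X = rng.random((18, 5))
y = X @ np.array([2.0, 1.0, 3.0, 0.5, 1.5]) + rng.normal(0, 0.2, 18)

full_r2 = LinearRegression().fit(X, y).score(X, y)
for i, name in enumerate(names):
    X_wo = np.delete(X, i, axis=1)  # refit without this metric
    r2 = LinearRegression().fit(X_wo, y).score(X_wo, y)
    print(f"Marginal R-square of {name}: {full_r2 - r2:.3f}")
```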

PRECISION OF PREDICTIONS WITH ALL FIVE METRICS

[Figure: precision (%) of the predictions as a function of the number of testing days (5 to 30), for 6-month, 1-year, and 2-year post-release windows.]

CONCLUSION

• TTFF, FAR, and OFR complement the traditional MTBF and AVT in predicting the number of post-release defects.
• They provide faster predictions of the number of post-release defects, with good precision within just 5 days of a pre-release testing period.
• It takes MTBF up to 25 days to predict the number of post-release defects.
