2009-a review on reliability models with covariates
TRANSCRIPT
QUT Digital Repository: http://eprints.qut.edu.au/
Gorjian, Nima and Ma, Lin and Mittinty, Murthy and Yarlagadda, Prasad and Sun, Yong (2009) A review on reliability models with covariates. In: Proceedings of the 4th World Congress on Engineering Asset Management, 28-30 September 2009, Marriott Athens Ledra Hotel, Athens.
© Copyright 2009 Springer
Proceedings of the 4th
World Congress on Engineering Asset Management
Athens, Greece
28 - 30 September 2009
A REVIEW ON RELIABILITY MODELS WITH COVARIATES
Nima Gorjian a, b
, Lin Ma a, b
, Murthy Mittinty c, Prasad Yarlagadda
b, Yong Sun
a, b
a Cooperative Research Centre for Integrated Engineering Asset Management (CIEAM), Brisbane, Australia
b School of Engineering Systems, Queensland University of Technology (QUT), Brisbane, Australia
c School of Mathematical Sciences, Queensland University of Technology (QUT), Brisbane, Australia
Modern Engineering Asset Management (EAM) requires the accurate assessment of current and the prediction of
future asset health condition. Suitable mathematical models that are capable of predicting Time-to-Failure (TTF)
and the probability of failure in future time are essential. In traditional reliability models, the lifetime of assets is
estimated using failure time data. However, in most real-life situations and industry applications, the lifetime of
assets is influenced by different risk factors, which are called covariates. The fundamental notion in reliability
theory is the failure time of a system and its covariates. These covariates change stochastically and may
influence and/or indicate the failure time. Research shows that many statistical models have been developed to
estimate the hazard of assets or individuals with covariates. An extensive amount of literature on hazard models
with covariates (also termed covariate models), including theory and practical applications, has emerged. This
paper is a state-of-the-art review of the existing literature on these covariate models in both the reliability and
biomedical fields. One of the major purposes of this expository paper is to synthesise these models from both
industrial reliability and biomedical fields and then contextually group them into non-parametric and semi-
parametric models. Comments on their merits and limitations are also presented. Another main purpose of this
paper is to comprehensively review and summarise the current research on the development of the covariate
models so as to facilitate the application of more covariate modelling techniques into prognostics and asset
health management.
Key Words: Covariate model, Hazard, Reliability analysis, Survival analysis, Asset health, Life prediction
1 INTRODUCTION
In recent years, the emphasis on prognostics and asset life prediction has increased in the area of Engineering Asset
Management (EAM) due to longer-term planning and budgeting requirements. One essential scientific research problem in
EAM is the development of mathematical models that are capable of predicting Time-To-Failure (TTF) and the probability of
failures in future time. In most real-life situations and industry applications, the hazard (failure rate) of assets is influenced
and/or indicated by different risk factors, which are often termed as covariates. Probabilistic modelling of assets lifetime using
covariates (i.e. diagnostic factors and operating environment factors) is one of the indispensible scientific research problems
for prognostics and asset life prediction.
Until now, a number of statistical models have been developed to estimate the hazard of an asset/individual with covariates
in both the reliability and biomedical fields. Most of these models are developed based on the Proportional Hazard Model
(PHM) theory which was proposed by Cox in 1972 [1]. The basic theory of these covariate models is to build the baseline
hazard function using historical failure data and the covariate function using covariate data. There are few review papers on the
covariate models which have been reported in the literature. Kumar and Klefsjo [2] reviews the existing literature on the
proportional hazard model. Kumar and Westberg [3] provides a review of some reliability models for analysing the effect of
operating conditions on equipment lifetime. Ma [4] discusses new research directions for Condition Monitoring (CM) and
reviews some prognostic models in EAM.
Almost all existing covariate models have been applied in the biomedical field. However, some of them have been applied
in the reliability area. This expository paper is a collective review of the existing literature on covariate models in both the
reliability and biomedical fields. In this paper, each individual covariate model has been contextually grouped into non-
parametric and semi-parametric models. Moreover, comments on their merits and limitations are discussed. Applications of
each model in both biomedical and reliability field are also presented. The purpose of this study is to facilitate the application
of more covariate modelling techniques into prognostics and asset life prediction.
The reminder of this paper is organised as follows. Section 2 classifies these covariate models into two groups and then
explains them in greater detail. In this section, the merits, limitations, and applications of each model are discussed. Section 3
provides the conclusions of this paper.
2 SURVIVAL / RELIABILITY MODELS WITH COVARIATES
Survival / reliability analysis (also called failure time analysis) is a specific field of statistics that studies failure time and its
probability on a group or groups of assets/individuals. Failure time is a defined point event, often called failures, occurring
after a length of time. Some examples of failure times include the lifetimes of machine components in reliability field, the
survival times of patients in a clinical trial, and durations of economic recessions in economics. Survival analysis was
advanced at the UC Berkeley in 1960s to present a better analysis method for Life Table data [5]. The development of
statistical procedures and models for survival analysis exploded in the 1970s. In the 1980s and early 1990s, survival models
with covariates have been widely applied in both the reliability and biomedical research. In general, the survival / reliability
models with covariates can be classified into two groups as: non-parametric and semi-parametric models, which are discussed
in the following sub-sections.
2.1 Non-Parametric Models
In non-parametric models, the form of degradation paths or distribution of degradation measure is unspecified [6]. When
the failure time data involve complex distributions that are largely unknown, or when the number of observations is small, it is
difficult to accurately fit a known failure time distribution. In other words, non-parametric models are used to avoid making
unrealistic assumptions that would be difficult to test [7]. Such models to be reviewed include:
Proportional hazard model
Stratified proportional hazard model
Two-step regression model
Additive hazard model
Mixed (additive-multiplicative) model
Accelerated failure time model
Extended hazard regression model
Proportional intensity model
Proportional odds model
Proportional covariate model
2.1.1 Proportional Hazard Model
The Proportional Hazard Model (PHM), which is a multivariate regression analysis, was first proposed by Cox [1] in 1972.
This model estimates the effects of different covariates influencing TTF of a system. This model has been employed for
different applications in lifetime data analysis. Due to its generality and flexibility, PHM was quickly and widely adopted in
the biomedical, reliability, and economics from 1970s to early 1990s. Almost all covariate models are based on PHM theory.
Cox’s PHM for static explanatory variables is expressed as [1, 8]:
𝑡; 𝒛 = 0 𝑡 𝜓(𝜸𝒛) (1)
Where, 0(𝑡) is the unspecified baseline hazard function which is dependent on time only and without influence of
covariates. The positive functional term,𝜓(𝜸𝒛), is dependent on the effects of different factors, which have multiplicative
effect (rather than additive) on the baseline hazard function.
The proportionality assumption in PHM is that (𝑡; 𝑧𝑋) 𝑡; 𝑧𝑌 = 0 𝑡 exp(𝑧𝑋𝛾) 0 𝑡 exp(𝑧𝑌𝛾) = exp 𝛾(𝑧𝑋 − 𝑧𝑌) .
The hazard at different 𝑧 values are in constant proportion for all 𝑡 > 0, hence the name for PHM [9-11].
Cox’s PHM for dynamic explanatory variables is [8, 12]:
𝑡; 𝒛(𝑡) = 0 𝑡 𝜓(𝜸𝒛(𝑡)) (2)
Three parameterizations of 𝜓 may be considered as log linear, linear, and logistic forms [8]. The log linear form has
become the most popular for good reasons. Covariates are represented by a row vector consisting of the covariates 𝒛 and 𝜸 is a
column vector consisting of the regression coefficients. The covariate 𝑧 is associated with the system, and 𝛾 is the unknown
parameter of the model, defining the effects of the covariates. Cox [1, 13] propose the conditional likelihood, which later is so
called partial likelihood, to estimate regression coefficients (regression coefficients) (𝛾). Kalbfleisch and Prentice [12, 14]
propose the marginal likelihood which is identical to partial likelihood. Their likelihood can deal with tied data
(assets/individuals which failed simultaneously), censored data and uncensored (observed) data. Regression coefficients are
estimated by maximising the partial likelihood.
A number of graphical techniques, goodness-of-fit techniques, and confidence intervals techniques can be employed to
examine the appropriateness and fit of the PHM [15-17]. Since 1972, a number of applications of PHM in the reliability [10,
11, 15, 18-43] and biomedical fields have been developed.
Key merits
PHM is an influential technique which can be used to investigate the effects of various explanatory variables on hazard
of assets/individuals This approach is essentially distribution free, thus it does not have to assume a specific form for the baseline hazard
function Regression coefficients are estimated using partial likelihood without the need of specifying the baseline hazard
function This model is available for both static and dynamic explanatory variables
Explanatory variables have a multiplicative effect (rather than additive effect) on the baseline hazard function, thus it is
a more realistic and reasonable assumption
This model handles truncated, non-truncated data, and tied values
Many goodness-of-fit tests and graphical methods are available for this model [17]
Key limitations
PHM is a vulnerable approach when covariates are deleted or the precision of covariate measurements is changed.
Therefore, if one pertinent covariate is omitted, even if it is independent of the other covariates in the model, averaging
on the omitted covariate gives a new model which leads to biased estimates of regression coefficients (𝛾) [10]
The estimated value of the regression coefficient is biased in the case of a small sample size
Mixing different types of covariates in one model may cause some problems
The main assumption of this model is that an asset/individual life is assumed to be terminated at the first failure time.
In other words, this model depends only on the time 𝑡 elapsed between the starting event (e.g. diagnosis) and the
terminal event (e.g. fail) and not on the chronological time 𝑡
The influence of a covariate in PHM is assumed to be time-independent [44]
Due to proportionality assumption, a common baseline hazard for all assets/individuals has been assumed in a case in
which the assets/individuals should be stratified according to baseline [12, 17]
Proportionality assumption imposes a severe limitation which that survival curves for assets/individuals with different
covariates can never cross [45-49]
2.1.2 Stratified Proportional Hazard Model
In the Stratified Proportional Hazard Model (SPHM), it is assumed that the population can be divided into 𝑞 strata (or
levels), based on the discrete values of a single covariate or combination of discrete values of a set of covariates [3]. For
example, an asset operates at three different temperature levels; say low, medium, and high. In SPHM, it is assumed that the
hazard is proportional within the same stratum (or level) but not necessary across different strata. The hazard of a system in the
𝑗𝑡 stratum can accordingly be expressed as [12]:
𝑗 𝑡; 𝒛 = 0𝑗 𝑡 exp(𝜸𝒛) (3)
A similar likelihood method to Cox’s PHM is used to estimate regression coefficients (𝜸). The model is applied in
biomedical by Kay [50]. This model applied by Kumar [31] in the reliability field.
Key merits
The arbitrary baseline hazard function is allowed to be different for each stratum whereas the regression coefficients
are the same across strata
SPHM is one of the simplest and most useful extensions of the PHM for its application in different situations
Similar to PHM, this approach is essentially distribution free, thus it does not have to assume a specific form for the
baseline hazard function Regression coefficients are estimated using partial likelihood without the need of specifying the baseline hazard
function Explanatory variables have a multiplicative effect (rather than additive effect) on the baseline hazard function, thus it is
a more realistic and reasonable assumption
Similar to PHM, many goodness-of-fit tests and graphical methods are available for PSHM
Key limitations
Due to multicollinearity, estimated values of regression coefficients (𝛾) are sensitive to omission, misclassification and
time dependence of explanatory variables [10]
The estimated value of the regression coefficient is biased in the case of a small sample size
The main assumption of this model is that an asset/individual life is assumed to be terminated at the first failure time.
In other words, this model depends only on the time 𝑡 elapsed between the starting event (e.g. diagnosis) and the
terminal event (e.g. fail) and not on the chronological time 𝑡
2.1.3 Two-Step Regression Model
Anderson and Senthilselvan [51] extends Cox’s PHM to allow (approximately) for changing covariate effects in time and
presents a method of estimation. This model is applied and tested in the biomedical field. This extension is called two-step
regression model. Anderson and Senthilselvan [51] also introduces a method for smooth estimates of the hazard. In general, the
prediction effect of a covariate measured at a particular point of time (the beginning of a study) becomes progressively less
important as time goes by. If there is a good deal of information about a particular process, it may be possible to model this
directly and express 𝛽𝑗 as a specific function of time, such as 𝛽𝑗 𝑡 = 𝛽𝑗∗𝑒−𝛾𝑗 𝑡 , 𝑗 = 1, … , 𝑝. Two-step regression model allows
a very simple form of time-dependent for 𝛽𝑗 . It assumes that [51]:
𝛽𝑗 𝑡 = 𝜶𝒋 for 𝑡 ≤ 𝐵 and 𝛽𝑗 𝑡 = 𝜸𝒋 for 𝑡 > 𝐵 (4)
For some 𝐵 , which is not assumed known a priori. In most situations this step-function must be regarded as a first
approximation to the true form of 𝛽, as a function of time. The regression coefficients 𝜶𝑇 = (𝛼1, … , 𝛼𝑝) and 𝜸𝑇 = (𝛾1, … , 𝛾𝑝)
refer to short-term and long-term dependence of the hazard of 𝑧 [51]. It is noticeable that approximation implies roughly equal
rates of change for the 𝛽𝑗 (𝑡). The hazard of two-step regression model is [51, 52].
𝑡; 𝒛 = 0(𝑡) exp 𝜶𝑇𝒛 , 𝑡 ≤ 𝐵
1(𝑡) exp 𝜸𝑇𝒛 , 𝑡 > 𝐵
(5)
Anderson and Senthilselvan [51] proposes the conditional log-likelihood of Peto in Cox’s discussion paper [1] to estimate
regression coefficients. By maximising this log-likelihood regression coefficients will be estimated.
Key merits
Literature shows this model fits the data much better than Cox’s PHM [51, 52]
This model is more appropriate to deal with time-dependent covariates rather than Cox’s PHM
This model handles heavy censoring circumstance better than Cox’s PHM [51, 52]
Approximation of the model replaces a time-dependent coefficient by the step function, which is representing short-
term and long-term effects
This model can be extended to three-step regression function
Key limitations
This model has difficulty for estimating regression parameters due to large values of breakpoint 𝐵 [51]
This model has a common breakpoint for all covariates [51, 52]
2.1.4 Additive Hazard Model
The Additive Hazard Model (AHM) can be described as [53]:
𝑡; 𝒛 = 0 𝑡 + 𝜓(𝒛𝜸) (6)
Where, 0(𝑡) is the unspecified baseline hazard function which is dependent on time only and without influence of
covariates. The positive or negative functional term,𝜓(𝜸𝒛), is dependent on the effects of different factors, which have additive
effect (rather than multiplicative) on the baseline hazard function. This model provides the means for modelling a circumstance
when the hazard is not zero at time zero. Maximum likelihood procedures can be used to estimate this model’s parameters
[54].
Pijnenburg [53] tests this model in modelling the reliability of the air conditioning system of aircrafts. Newby [55] asserts
that AHM applications are restricted by an identifiability problem. Theoretical limitations leads to identification problems
while estimating parameters of the model [56]. Due to that this model is not identifiable, the observation of explanatory
variables does not add anything to the knowledge obtained from the event data [55].
Key merits
AHM is intuitively an attractive model when a system after repair is better than it was just before the repair, but not as
good as new
This model offers the hazard that is not zero at time zero
Key limitations
An additive assumption often leads to estimated hazard less than zero, as a result it is not a realistic and reasonable
assumption
This model cannot handle tied values (which have likelihood zero under the continuous random variable assumption)
The model cannot handle failure times equal to zero
This model can only be used in a phenomenological way to measure the magnitude of the jump 𝜓 in the hazard, and
models for the 𝜓𝑗 which makes use of explanatory variables are unlikely to produce satisfactory estimators of the
parameters [56]
2.1.5 Mixed (Additive-Multiplicative) Model
To enhance modelling capability about covariates, the mixed model considers the hazard of an asset/ individual, which
contains both a multiplicative and an additive component [57, 58]. The additive-multiplicative hazard model specifies that the
hazard for the counting process associates with a multidimensional covariate process 𝑍 = (𝑊𝑇 , 𝑋𝑇)𝑇 . Therefore, a general
additive-multiplicative hazard model takes the below form [59]:
(𝑡 𝑍) = 𝑔 𝛽0𝑇𝑊(𝑡) + 0(𝑡) 𝑓 𝛾0
𝑇𝑋(𝑡) (7)
Where, 𝜃0 = (𝛽0𝑇 , 𝛾0
𝑇)𝑇 is a vector of unknown regression coefficients, 𝑔 and 𝑓 are known link functions and 0 is an
unspecified baseline hazard function under 𝑔 = 0 and = 1 . Lin and Ying [59] develops a class of simple estimating functions
for 𝜃0, which contains the partial likelihood score function in the special case of proportional hazard model. The mixed model
is applied in the biomedical field [57].
Key merits
This model sometimes gives a better fit to the data rather than PHM [59]
This model allows covariates to have both the additive and multiplicative effects
Key limitation
Extremely limited testing
2.1.6 Accelerated Failure Time Model
The Accelerated Failure Time Model (AFTM) is one of the most common approaches used in obtaining reliability and
failure rate estimates of devices and components in a much shorter time [47, 60-62]. AFTM is to assume that the log
lifetime 𝑌 = log(𝑡), given applied stress vector 𝒛, has a distribution with a location parameter 𝜇(𝒛) and a constant scale
parameter 𝜎. AFTM can be expressed as [47, 62]:
𝑌 = log 𝑡 = 𝜇(𝒛) + 𝜎𝜖 (8)
Where, 𝜎 > 0 and 𝜖 is a random variable whose distribution does not depend on 𝒛. The hazard of AFTM can be written as
[8, 12]:
𝑡; 𝒛 = 0 𝑡 ∙ 𝜓(𝜸𝒛) ∙ 𝜓(𝜸𝒛) (9)
Where, 0(𝑡) denotes the baseline hazard function, and 𝜸 is a vector of regression coefficients. The effects of a covariate is
to accelerate or decelerate the failure time relative to the baseline hazard function according to 𝜓 𝜸𝒛 > 1 or 𝜓 𝜸𝒛 < 1 [60].
Maximum likelihood estimation can be used to estimate the parameters of the AFTM.
Key merit
AFTM is used to obtain reliability and failure rate estimates of assets and components in a much shorter time
Key limitation
AFTM is a time consuming process as well as is costly to set up this test
2.1.7 Extended Hazard Regression Model
The Extended Hazard Regression Model (EHRM) that includes PHM and AFTM is developed at 1985 [45, 46]. This model
assumes that a covariate vector changes the baseline hazard function, 0(𝑡), according to [45, 46]:
𝑡; 𝒛 = 𝑔1 𝜶𝒛 0(𝑔2 𝜷𝒛 𝑡) (10)
Where, 𝑔1(𝑥) and 𝑔2(𝑥) are positive functions equal to 1 at zero. 0(𝑡) denotes the baseline hazard function, and 𝜶 and 𝜷
are vectors of regression coefficients. For simplicity, it is assumed that 𝑔1 𝑥 = 𝑔2 𝑥 = exp(𝑥). The general case describes a
situation in which 𝒛 affects survival by changing both the time scale by the factor 𝑔2 𝜷𝒛 , and the scale in which the hazard is
measured by the factor 𝑔1 𝜶𝒛 𝑔2 𝜷𝒛 . This model reduces to the PHM for 𝜷 = 0 and to AFTM for 𝜶 = 𝜷. This model
describes a situation in which 𝒛 influences survival by changing both the time scale by the factor 𝑔2(𝜷𝒛), and the scale in
which the hazard is measured by the factor 𝑔1(𝜶𝒛) 𝑔2(𝜷𝒛) .Maximum likelihood function based on polynomial spline
approximation is developed to estimate all of the parameters of the model. This model is applied in both the biomedical [45,
46] and reliability fields [47].
Key merits
The appropriate applications of this model are AFTM dependent on the failure data analysis and types of failure
mechanisms
EHRM is a general model for hazard which includes both PHM and AFTM
Key limitations
Maximum likelihood estimation has a restricted assumption due to choosing a quadratic splines in order to maintain
the number of parameters small
Maximum likelihood estimation for this model is based on approximation of 0(∙) by splines; however, a spline
function cannot guarantee that any point in 0 ∙ is always positive
2.1.8 Proportional Intensity Model
The Proportional Intensity Model (PIM) was first introduced by Cox [1]. PIM is similar to PHM, but with the underlying
failure mechanism following a stochastic point process rather than a probabilistic distribution [63]. PIM is used to model the
intensity process of failures and repairs of a repairable system which incorporates explanatory variables [54, 64]. Volk et al.
[65] introduces PIM for both non-repairable and repairable systems utilising historic failure data and corresponding diagnostic
measurements. PIM assumes that a system enters stratum 1 at 𝑡 = 0 and that it enters stratum 𝑖 immediately following the
𝑗 − 1 𝑡 failure, 𝑗 = 1,2, … , 𝑚 [3]. The classes are based on two time scales, namely the global time 𝑡 and the time from the
immediately preceding failure, 𝑡 − 𝑡𝑛(𝑡), respectively [66, 67]:
𝑡 𝑁 𝑡 , 𝒛(𝑡) = 0𝑗 𝑡 exp 𝒛(𝑡)𝜸𝑗
𝑡 𝑁 𝑡 , 𝒛(𝑡) = 0𝑗 𝑡 − 𝑡𝑛(𝑡) exp 𝒛(𝑡)𝜸𝑗
(11)
Where, 𝑁(𝑡) represents a random variable for the number of failure in (0, 𝑡], and 𝒛(𝑡) denotes the covariate process up to
time 𝑡. 𝑡 𝑁 𝑡 , 𝒛(𝑡) and 0𝑗 𝑡 are the intensity function and the baseline intensity function, respectively, and 𝜸𝑗 is the
regression coefficient for the 𝑗𝑡 stratum. The baseline intensity function can have three different forms: constant intensity,
log-linear intensity, and power-law intensity [68]. Maximum likelihood estimation is used to estimate regression coefficients.
This model is applied in the reliability field [31, 34, 64, 66].
Key merits
PHM assumes a system is renewed at failure, while PIM does not necessary make this assumption
This model is suitable for optimising maintenance and repair policy in a cost-effective manner
This model is simple, and directly related to the two most common models (PHM and AFTM) already in use, and
effective in discriminating between them by means of 𝑥2 statistics based on a few degrees of freedom only
Key limitation
If covariates are deleted from the model or measured with different level of precision, the proportionality is in general
destroyed
2.1.9 Proportional Odds Model
In 1980, McCullagh [69] generalises the idea of constant odds ratio to more than two samples by means of a regression
model which is termed as the Proportional Odds Model (POM). The theoretical basis for the model assumes that prognostics
factors have a multiplicative effect on the odds against survival beyond any given time [70]. POM can be expressed as [12, 69,
71]:
𝐹(𝑡; 𝒛)
1 − 𝐹(𝑡; 𝒛)=
𝐹0 𝑡
1 − 𝐹0 𝑡 𝜓(𝒛𝜸)
(12)
Where, 𝐹(𝑡; 𝒛) is the cumulative distribution function of the occurrence of events in group 𝑖 and 𝐹0(𝑡) is an underlying
unknown cumulative distribution function. The term 𝐹𝑖(𝑡) 1 − 𝐹𝑖(𝑡) is called proportional odds ratio. The full maximum
likelihood function of this model is derived by Bennett and McCullagh [69, 71]. POM is developed [69] and applied [70] in the
biomedical field.
Key merits
Hazard for separate groups of asset/individual converges with time
Due to hazard converges with time, it is useful model when the effects of covariates diminish or disappear with time
increases
Key limitation
It is necessary to use some time transformation of the failure times to estimate the parameters of the model
2.1.10 Proportional Covariate Model
The Proportional Covariate Model (PCM) is developed by Sun et al. [72] in 2006. PCM assumes that covariate of a system,
or a function of those covariates, are proportional to the hazard of the system. Sun et al. [72] and Sun and Ma [73] claim that
PCM is developed due to some shortcomings of PHM. The generic form of PCM is expressed as [72]:
𝑍𝑟 𝑡 = 𝐶 𝑡 (𝑡) (13)
Where, 𝑍𝑟(𝑡) is the covariate function which is usually time dependent. The variable 𝐶(𝑡) is the baseline covariate function
which is also usually time dependent. The function (𝑡) is the hazard of a system. Maximum likelihood estimation is used to
estimate parameters. Due to the novelty of PCM, this model is only tested by laboratory data in the reliability field.
Key merits
Baseline covariate function 𝐶(𝑡) can take into account both historical failure data and historical condition monitoring
data
Baseline covariate function can be updated according to newly observed failure data and covariates
PCM is used to update the hazard of a system
Key limitations
This model does not consider operating environment data. Thus, this model is not sensitive to severe environments (e.g.
high ambient temperature)
Unlike the authors’ claims as an advantage of the model, PCM requires historical failure data to establish the covariate
baseline
PCM has not been applied widely in literature as it is a relatively new model
2.2 Semi-Parametric Models
In parametric models, the form of degradation paths or distribution of degradation measure is specified and/or partially
specified [6, 74]. These models can be classified as:
Weibull proportional hazard model
Logistic regression model
Log-Logistic regression model
Aalen’s regression model
2.2.1 Weibull Proportional Hazard Model
The Weibull Proportional Hazard Model (WPHM) is a special case of PHM when the Weibull distribution is assumed for
the failure times. Thus, in this model the baseline hazard function is Weibull distribution [33, 75, 76]. This model, as employed
by Jardine [21, 26], introduces a new concept in reliability of utilising diagnostic factors (condition monitoring data) as
explanatory variables. This model is expressed as [21, 77]:
𝑡; 𝒛(𝑡) =𝛽
𝜂 𝑡
𝜂
𝛽−1
𝜓(𝜸𝒛(𝑡))
(14)
Jardine et al. [21] derives a new likelihood function. By maximising this likelihood function, all parameters (i.e. regression
coefficients and parameters of Weibull distribution in the baseline hazard function) are estimated. Moreover, EXAKT software
is developed based on WPHM by Jardine et al. [78, 79] in University of Toronto.
Key merits
WPHM is an influential technique which can be used to investigate the effects of various explanatory variables on the
life length of assets/individuals
Explanatory variables have a multiplicative effect (rather than additive effect) on the baseline hazard function, thus it is
a more realistic and reasonable assumption
The model handles truncated or non-truncated data
Key limitations
Due to multicollinearity, estimated values of regression coefficients (𝛾) are sensitive to omission, misclassification and
time dependence of explanatory variables [10]
The estimated value of the regression coefficient is biased in the case of a small sample size
Mixing different types of covariates in one model may cause some problems
The main assumption of this model is that an asset/individual life is assumed to be terminated at the first failure time,
in other words this model depends only on the time 𝑡 elapsed between the starting event (e.g. diagnosis) and the
terminal event (e.g. death) and not on the chronological time 𝑡
Unlike PHM, this model does assume a specified form for the baseline hazard function
Proportionality assumption imposes a severe limitation which that survival curves for assets/individuals with different
covariates can never cross [45-49]
2.2.2 Logistic Regression Model
The Logistic regression model is a special case of POM. Logistic regression model is usually adopted to relate the
probability of an event to a set of covariates [80]. This concept can be used in degradation analysis. If current degradation
features are 𝒛(𝑡), Liao [80] defines the odds ratio between the reliability function 𝑅(𝑡 𝒛(𝑡)) and the cumulative distribution
function as:
𝑅(𝑡 𝒛(𝑡))
1 − 𝑅(𝑡 𝒛(𝑡)) = exp(𝛼 + 𝜸𝒛 𝑡 )
(15)
Where 𝛼 > 0 and 𝜸 are the model parameters to be estimated. Therefore, the reliability function can be expressed as:
𝑅(𝑡 𝒛(𝑡)) =exp(𝛼 + 𝜸𝒛 𝑡 )
1 + exp(𝛼 + 𝜸𝒛 𝑡 )
(16)
Liao [80] asserts that maximum likelihood function for the model parameters can be obtained by maximising the log-
likelihood function using the Nelder-Mead’s algorithm. This model is applied in reliability analysis [80].
Key merit
Based on its likelihood function, it needs less computation effort to estimate parameters rather than PHM [80]
Key limitations
Unlike POM, this model assumes a specified distribution
To estimate parameters and evaluate the reliability function, this model takes into account only the current covariates,
whereas PHM the entire history ones [80]
2.2.3 Log-Logistic Regression Model
The Log-Logistic regression model is a special case of POM when a Log-Logistic distribution is assumed for the failure
times [3]. The Log-Logistic regression model is described in which the hazard for separate samples converges with time.
Therefore, this provides a linear model for the log odds on survival by any chosen time. This model is developed to overcome
some shortcomings of Weibull distribution in the modelling of failure time data.
The distribution used frequently in the modelling of survival and failure time data is the Weibull distribution. However, its
application is limited by the fact that its hazard, while may be increasing or decreasing, must be monotonic, whatever the
values of its parameters. Bennett [71] claims that Weibull distribution may be inappropriate where the course of the failure
(e.g. disease in individuals) is such that mortality reaches a peak after some finite period, and then slowly declines. The hazard
of Log-Logistic regression model is [71]:
𝑡; 𝒛 =𝛿
𝑡 1 + 𝑡−𝛿 exp(−𝒛𝜸)
(17)
Which has its maximum value at 𝑡 = 1 − 𝛿 exp( −𝒛𝜸) 1 𝛿 . Where, 𝛿 is a measure of precision and 𝜸 is a measure of
location. The hazard is assumed to be increasing first and then decreasing with a change at the time. Parameters of this model
are estimated by maximising the likelihood function [71]. This model is tested and applied in biomedical [71]. The ratio of the
hazard for a covariate 𝒛 taking two values 𝑧1 and 𝑧2, which converges to unity as 𝑡 increases, is given by [71]:
(𝑡; 𝑧1)
(𝑡; 𝑧2)=
1 + 𝑡−𝛿 exp(−𝛾1𝑧2)
1 + 𝑡−𝛿 exp(−𝛾2𝑧1)
(18)
Key merits
It is more suitable to apply in the analysis of survival data rather than Log-Normal distribution
It is suitable model where hazard reaches a peak after some finite period, and the slowly declines
It has mathematical tractability when dealing with the censored observations
The hazard for different samples is not proportional through time as in the Weibull model, but that their ratio trends to
unity. Thus, this property is desirable when the initial effects of covariates (e.g. treatment) trend to diminish with time,
and the survival probabilities of different groups of asset/ individual become more similar
Key limitation
Unlike POM, this model assumes a specified distribution
2.2.4 Aalen’s Regression Model
The Aalen’s regression model is introduced by Aalen in 1980 [81]. This model is based on Aalen’s multiplicative intensity
model for counting processes [82] to assess additive time-dependent covariate effects in possibly right-censored survival data.
At first glance, this model seems to be non-parametric due to the absence of specified distribution. In this model linearity
represents a kind of distributional assumption; however, no finite-dimensional parameter is introduced in the model [82].
Therefore, in this study, it is classified as a semi-parametric model. Aalen develops this model in order to improve some
restrictions of Cox’s PHM. In his model, 𝑖(𝑡) denotes the intensity of the event happening at time 𝑡 for the 𝑖𝑡
asset/individual, (𝑖 𝑡 dt is the probability that the event occurs in some small time interval between 𝑡 and 𝑡 + 𝑑𝑡 given that it
has not happened before). 𝑛 is the number of assets/individuals and 𝑟 is the number of covariates in the analysis. Aalen [82]
considers the following linear model for the vector,𝒉(𝑡; 𝒛), of intensities 𝑖 𝑡 , 𝑖 = 1,2, … , 𝑛:
𝒉 𝑡; 𝒛 = 𝒀 𝑡 𝜸(𝑡) 0 ≤ 𝑡 ≤ 1 (19)
The 𝑛 × (𝑟 + 1) matrix 𝒀(𝑡) is constructed as follows: if the 𝑖𝑡 asset/individual is a member of the risk set at time 𝑡 then
the 𝑖𝑡 row of 𝒀(𝑡) is the vector 𝒛𝑖 𝑡 = (1, 𝑧1𝑖 𝑡 , 𝑧2
𝑖 𝑡 , … , 𝑧𝑟𝑖 𝑡 )′, where 𝑧𝑗
𝑖 𝑡 , 𝑗 = 1,2, … , 𝑟, are time-dependent covariate
values. If the 𝑖𝑡 asset/individual is not in the risk set at time 𝑡, thus the corresponding row of 𝒀(𝑡) contains only zeros. The
first element of the vector 𝜸 𝑡 , 𝛾0(𝑡), is interpreted as a baseline parameter function, while the remaining elements, 𝛾𝑖 𝑡 , 𝑖 =1,2, … , 𝑟, are called regression coefficients, which measure the influence of the respective covariates. Aalen’s regression model
explicitly allows for contributions of the covariates that change over time, since the regression functions may vary arbitrary
with time [52]. Consequently, if the effect of covariates is zero, the slope will also be zero. If the covariate has a constant
influence over time, the plot will be approximately a straight line. If the slope is positive (or negative), it shows that the effect
of covariates is to increase (or decrease) the hazard. If the plot is a curve with an increasing (or decreasing) slope this indicates
an increase (or decrease) in the magnitude of influence of covariates [3]. Aalen’s regression model has been applied in
biomedical [52, 82, 83] and reliability [44] fields. The merits and limitations of the model are illustrated in the following table.
Key merits
The main advantage of this model is its linearity
It is less vulnerable approach than Cox’s PHM to problem of inconsistency when covariates are deleted or the
precision of covariate measurements is changed [82]
If a covariate which is independent of the other covariates is removed from the model, then the new model is still linear
with unchanged regression coefficients for the remaining covariates, only the baseline parameter function is affected
This model shows the influence of time-dependent covariates better than Cox’s PHM [82]
Key limitations
If a covariate which is not independent of the other covariates is removed from the model, then it is necessary to
assume a multivariate normal distribution for the covariates in order to the new model stay linear, thus normal
assumption cannot be taken literally, since that might give positive probability to negative intensities
The obvious weakness of this model compared to Cox’s PHM is the expression for 𝑖(𝑡) does not restrict naturally to
non-negative values
The consequence of the lack of restriction to non-negative values is that the estimated survival function may not be
monotonically decreasing throughout, but can have occasional lapses where it increases slightly
Suitable methods of parameter estimation and its goodness-of-fit require to be investigated
3 CONCLUSIONS
The hazard model with covariates is one of the most common statistical models in reliability and survival analysis. This
expository paper reviews the existing literature on hazard models with covariates (termed covariate models) in both the
industrial and biomedical fields. This paper synthesises these models from both industrial reliability and biomedical fields and
then contextually groups them into: non-parametric and semi-parametric models. The merits and limitations of these models
have been discussed so as to establish suitable potential applications, especially for the information of fellow researchers.
This review paper demonstrates that all of these covariate models have been developed and then tested in the biomedical
field to estimate the survival time of patients or individuals. Only a few of these models appear in reliability literature and have
been applied in industrial cases. Most covariate models have not yet been effectively applied in reliability analysis. One reason
for this situation may be the lack of awareness of these covariate models by reliability engineers. Moreover, due to the
prominence of some models (e.g. PHM and WPHM), attempts to develop and apply alternative covariate models in the
reliability field have been somehow stifled. It may be beneficial to introduce and evaluate more covariate models to reliability
analysis for industrial cases.
4 REFERENCES
1 Cox DR. (1972) Regression models and life-tables. Royal Statistical Society, 34(2), 187-220.
2 Kumar D & KlefsjÖ B. (1994) Proportional hazards model: a review. Reliability Engineering & System Safety, 44(2),
177-188.
3 Kumar D & Westberg U. (1996) Some reliability models for analysing the effects of operating conditions.
International Journal of Reliability, Quality and Safety Engineering, 4(2), 133-148.
4 Ma L. (2007) Condition monitoring in engineering asset management. APVC. p. 16.
5 Ma Z & Krings AW. (2008) Survival analysis approach to reliability, survivability and prognostics and health
management. IEEE Aerospace Conference. pp. 1-20.
6 Liao H. (2004) Degradation models and design of accelerated degradation testing plans. United States -- New Jersey:
Rutgers The State University of New Jersey - New Brunswick.
7 Luxhoj JT & Shyur H-J. (1997) Comparison of proportional hazards models and neural networks for reliability
estimation. Intelligent Manufacturing, 8(3), 227-234.
8 Cox DR & Oakes D. (1984) Analysis of survival data. London ; ; New York: Chapman and Hall.
9 Crowder MJ. (1991) Statistical analysis of reliability data (1st ed.). London ; ; New York: Chapman & Hall.
10 Kumar D & Klefsjoe B. (1994) Proportional hazards model-an application to power supply cables of electric mine
loaders. International Journal of Reliability, Quality and Safety Engineering, 1(3), 337-352.
11 Landers TL & Kolarik WJ. (1986) Proportional hazards models and MIL-HDBK-217. Microelectronics and
Reliability, 26(4), 763-772.
12 Kalbfleisch JD & Prentice RL. (2002) The statistical analysis of failure time data (Second ed.). New Jersey: Wiley.
13 Cox DR. (1975) Partial likelihood. Biometrika, 62(2), 269-276.
14 Kalbfleisch JD & Prentice RL. (1973) Marginal likelihoods based on Cox's regression and life model. Biometrika,
60(2), 267-278.
15 Booker J, Campbell K, Goldman AG, Johnson ME & Bryson MC. (1981) Applications of Cox's proportional hazards
model to light water reactor component failure data (No. LA-8834-SR; Other: ON: DE81023991).
16 Breslow N. (1974) Covariance analysis of censored survival data. Biometrics, 30(1), 89-99.
17 Arjas E. (1988) A graphical method for assessing goodness of fit in Cox's proportional hazards model. American
Statistical Association, 83(401), 204-212.
18 Bendell A, Walley M, Wightman DW & Wood LM. (1986) Proportional hazards modelling in reliability analysis - an
application to brake discs on high speed trains. Quality and Reliability Engineering International, 2(1), 45-52.
19 Dale CJ. (1985) Application of the proportional hazards model in the reliability field. Reliability Engineering, 10(1),
1-14.
20 Ansell RO & Ansell JI. (1987) Modelling the reliability of sodium sulphur cells. Reliability engineering, 17(2), 127-
137.
21 Jardine AKS, Anderson PM & Mann DS. (1987) Application of the Weibull proportional hazads model to aircraft and
marine engine failure data. Quality and Reliability Engineering International, 3(2), 77-82.
22 Landers TL & Kolarik WJ. (1987) Proportional hazards analysis of field warranty data. Reliability engineering, 18(2),
131-139.
23 Baxter MJ, Bendell A, Manning PT & Ryan SG. (1988) Proportional hazards modelling of transmission equipment
failures. Reliability Engineering and System Safety, 21, 129–144.
24 Chan CK. (1990) A proportional hazards approach to correlate SiO2-breakdown voltage and time distributions. IEEE
Transactions on Reliability, 39(2), 147-150.
25 Drury MR, Walker EV, Wightman DW & Bendell A. (1988) Proportional hazards modelling in the analysis of
computer systems reliability. Reliability Engineering and System Safety, 21, 197-214.
26 Jardine AKS, Ralston P, Reid N & Stafford J. (1989) Proportional hazards analysis of diesel engine failure data.
Quality and Reliability Engineering International 5(3), 207-16.
27 Leitao ALF & Newton DW. (1989) Proportional hazards modelling of aircraft cargo door complaints. Quality and
Reliability Engineering International, 5(3), 229-238.
28 Mazzuchi TA & Soyer R. (1989) Assessment of machine tool reliability using a proportional hazards model. Naval
Research Logistics, 36(6), 765-777.
29 Mazzuchi TA, Soyer R & Spring RV. (1989) The proportional hazards model in reliability. Reliability and
Maintainability Symposium, 1989. Proceedings., Annual. pp. 252-256.
30 Elsayed EA & Chan CK. (1990) Estimation of thin-oxide reliability using proportional hazards models. IEEE
Transactions on Reliability 39(3), 329-335.
31 Kumar D. (1995) Proportional hazards modelling of repairable systems. Quality and Reliability Engineering
International, 11(5), 361-369.
32 Kumar D, KlefsjÖ B & Kumar U. (1992) Reliability analysis of power transmission cables of electric mine loaders
using the proportional hazards model. Reliability Engineering & System Safety, 37(3), 217-222.
33 Love CE & Guo R. (1991) Using proportional hazard modelling in plant maintenance. Quality and Reliability
Engineering International, 7(1), 7-17.
34 Love CE & Guo R. (1991) Application of weibull proportional hazards modelling to bad-as-old failure data. Quality
and Reliability Engineering International, 7(3), 149-157.
35 Pettitt AN & Daud IB. (1990) Investigating time dependence in Cox's proportional hazards model. Applied Statistics,
39(3), 313-329.
36 Gasmi S, Love CE & Kahle W. (2003) A general repair, proportional-hazards, framework to model complex
repairable systems. IEEE Transactions on Reliability, 52(1), 26-32.
37 Jardine AKS, Banjevic D, Wiseman M, Buck S & Joseph T. (2001) Optimizing a mine haul truck wheel motors'
condition monitoring program: Use of proportional hazards modeling. Quality in Maintenance Engineering, 7(4), 286.
38 Krivtsov VV, Tananko DE & Davis TP. (2002) Regression approach to tire reliability analysis. Reliability
Engineering & System Safety, 78(3), 267-273.
39 Martorell S, Sanchez A & Serradell V. (1999) Age-dependent reliability model considering effects of maintenance
and working conditions. Reliability Engineering & System Safety, 64(1), 19-31.
40 Park S. (2004) Identifying the hazard characteristics of pipes in water distribution systems by using the proportional
hazards model: 2. Applications. KSCE Journal of Civil Engineering, 8(6), 669-677.
41 Prasad PVN & Rao KRM. (2002) Reliability models of repairable systems considering the effect of operating
conditions. Annual Reliability and Maintainability Symposium pp. 503-510.
42 Prasad PVN & Rao KRM. (2003) Failure analysis and replacement strategies of distribution transformers using
proportional hazard modeling. Annual Reliability and Maintainability Symposium pp. 523-527.
43 Wallace JM, Mavris DN & Schrage DP. (2004) System reliability assessment using covariate theory. Annual
Symposium Reliability and Maintainability pp. 18-24.
44 Kumar D & Westberg U. (1996) Proportional hazards modeling of time-dependent covariates using linear regression:
a case study. IEEE Transactions on Reliability, 45(3), 386-392.
45 Ciampi A & Etezadi-Amoli J. (1985) A general model for testing the proportional hazards and the accelerated failure
time hypotheses in the analysis of censored survival data with covariates. Communications in Statistics - Theory and
Methods, 14(3), 651 - 667.
46 Etezadi-Amoli J & Ciampi A. (1987) Extended Hazard Regression for Censored Survival Data with Covariates: A
Spline Approximation for the Baseline Hazard Function. Biometrics, 43(1), 181-192.
47 Shyur H-J, Elsayed EA & Luxhøj JT. (1999) A general model for accelerated life testing with time-dependent
covariates. Naval Research Logistics, 46(3), 303-321.
48 Meeker WQ & LuValle MJ. (1995) An accelerated life test model based on reliability kinetics. Technometrics, 37(2),
133-146.
49 Lawless JF. (1986) A note on lifetime regression models. Biometrika, 73(2), 509-512.
50 Kay R. (1977) Proportional hazard regression models and the analysis of censored survival data. Applied Statistics,
26(3), 227-237.
51 Anderson JA & Senthilselvan A. (1982) A two-step regression model for hazard functions. Applied Statistics, 31(1),
44-51.
52 Mau J. (1986) On a graphical method for the detection of time-dependent effects of covariates in survival data.
Applied Statistics, 35(3), 245-255.
53 Pijnenburg M. (1991) Additive hazards models in repairable systems reliability. Reliability Engineering and System
Safety, 31(3), 369-390.
54 Wightman D & Bendell T. (1995) Comparison of proportional hazards modeling, additive hazards modeling and
proportional intensity modeling when applied to repairable system reliability. International Journal of Reliability,
Quality and Safety Engineering, 2(1), 23-34.
55 Newby M. (1994) Why no additive hazards models? IEEE Transactions on Reliability, 43(3), 484-488.
56 Newby M. (1992) A critical look at some point-process models for repairable systems. IMA Journal of Management
Mathematics, 4(4), 375-394.
57 Andersen PK & Væth M. (1989) Simple parametric and nonparametric models for excess and relative mortality.
Biometrics, 45(2), 523-535.
58 Badía FG, Berrade MD & Campos CA. (2002) Aging properties of the additive and proportional hazard mixing
models. Reliability Engineering & System Safety, 78(2), 165-172.
59 Lin DY & Ying Z. (1995) Semiparametric analysis of general additive-multiplicative hazard models for counting
processes. The Annals of Statistics, 23(5), 1712-1734.
60 Lin Z & Fei H. (1991) A nonparametric approach to progressive stress accelerated life testing. IEEE Transactions on
Reliability 40(2), 173-176.
61 Mettas A. (2000) Modeling and analysis for multiple stress-type accelerated life data. Annual Reliability and
Maintainability Symposium. pp. 138-143.
62 Newby M. (1988) Accelerated failure time models for reliability data analysis. Reliability Engineering & System
Safety, 20(3), 187-197.
63 Landers TL & Soroudi HE. (1991) Robustness of a semi-parametric proportional intensity model. IEEE Transactions
on Reliability, 40(2), 161-164.
64 Lugtigheid D, Jardine AKS & Jiang X. (2007) Optimizing the performance of a repairable system under a
maintenance and repair contract. Quality and Reliability Engineering International, 23(8), 943-960.
65 Vlok P-J, Wnek M & Zygmunt M. (2004) Utilising statistical residual life estimates of bearings to quantify the
influence of preventive maintenance actions. Mechanical Systems and Signal Processing, 18(4), 833-847.
66 Jiang S-T, Landers TL & Rhoads TR. (2006) Proportional intensity models robustness with overhaul intervals.
Quality and Reliability Engineering International, 22(3), 251-263.
67 Prentice RL, Williams BJ & Peterson AV. (1981) On the regression analysis of multivariate failure time data.
Biometrika, 68(2), 373-379.
68 Percy DF, Kobbacy KAH & Ascher HE. (1998) Using proportional-intensities models to schedule preventive-
maintenance intervals. IMA J Management Math, 9(3), 289-302.
69 McCullagh P. (1980) Regression models for ordinal data. Royal Statistical Society. Series B, 42(2), 109-142.
70 Bennett S. (1983) Analysis of survival data by the proportional odds model. Statistics in Medicine, 2(2), 273-277.
71 Bennett S. (1983) Log-logistic regression models for survival data. Applied Statistics, 32(2), 165-171.
72 Sun Y, Ma L, Mathew J, Wang W & Zhang S. (2006) Mechanical systems hazard estimation using condition
monitoring. Mechanical Systems and Signal Processing, 20(5), 1189-1201.
73 Sun Y & Ma L. (2007) Notes on "mechanical systems hazard estimation using condition monitoring"--Response to
the letter to the editor by Daming Lin and Murray Wiseman. Mechanical Systems and Signal Processing, 21(7), 2950-
2955.
74 Kobbacy KAH, Fawzi BB, Percy DF & Ascher HE. (1997) A full history proportional hazards model for preventive
maintenance scheduling. Quality and Reliability Engineering International, 13(4), 187-198.
75 Jardine AKS. (1983) Component and system replacement decisions. J.K S (Ed.). Image Sequence Processing and
Dynamic Scene Analysis. pp. 647-654.
76 Jozwiak IJ. (1997) An introduction to the studies of reliability of systems using the Weibull proportional hazards
model. Microelectronics and Reliability, 37(6), 915-918.
77 Jardine AKS & Anders M. (1985) Use of concomitant variables for reliability estimation. Maintenance Management
International, 5, 135-140.
78 Jardine AKS, Banjevic D & Makis V. (1997) Optimal replacement policy and the structure of software for condition-
based maintenance. Quality in Maintenance Engineering, 3(2), 1355-2511.
79 Jardine AKS, Joseph T & Banjevic D. (1999) Optimizing condition-based maintenance decisions for equipment
subject to vibration monitoring. Quality in Maintenance Engineering, 5(3), 192-202.
80 Liao H, Zhao W & Guo H. (2006) Predicting remaining useful life of an individual unit using proportional hazards
model and logistic regression model. Annual Reliability and Maintainability Symposium pp. 127-132.
81 McKeague IW. (1986) Estimation for a semimartingale regression model using the method of sieves. The Annals of
Statistics, 14(2), 579-589.
82 Aalen OO. (1989) A linear regression model for the analysis of life times. Statistics in Medicine 8(8), 907-925.
83 Mau J. (1988) A comparison of counting process models for complicated life histories. Applied Stochastic Models
and Data Analysis, 4, 283–298.
Acknowledgement
The authors gratefully acknowledge the financial support provided by both the Cooperative Research Centre for Integrated
Engineering Asset Management (CIEAM), established and supported under the Australian Government’s Cooperative
Research Centres Programme, and the School of Engineering Systems of Queensland University of Technology (QUT).