#AnalyticsX Copyright © 2016, SAS Institute Inc. All rights reserved. Surviving Survival Forecasting of Product Failure Ryan Carr Advisory Statistical Data Scientist SAS [email protected]


Surviving Survival Forecasting of Product Failure

Ryan Carr
Advisory Statistical Data Scientist
SAS

[email protected]


Agenda

• Survival Model Concepts
  - Censoring & time alignment
  - Preparing the data for analysis
• Parametric Models
  - Exponential
  - LogNormal
  - Weibull 2p
  - Weibull 3p
  - Generalized Gamma
• Process to forecast % failure at fixed points from in-service date
• Process to forecast weekly failure based on in-service dates
• Conclusion


Survival Model Concepts
Right Censored

[Figure: service timelines for Unit 1 and Unit 2, each labeled 1 Mar 16; other labels are "29 Jun 16" and "Today".]

Unit 1 failed after 16 weeks.

The failure time for Unit 2 is considered "censored" since it did not fail during our study period.


Survival Model Concepts
Time Aligned

[Figure: calendar timeline with Unit 1 (1 Mar 16), Unit 3 (29 Mar 16) and "Today", redrawn as a time-aligned timeline in which both units start at Week 0; the aligned axis also carries a "Week 20" label.]

Unit 1 was placed into service the first week.

Unit 3 was not placed into service until 4 weeks later.

Time-align each unit by using relative times (hours, days, weeks, ...) rather than absolute times.
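The two ideas above (right censoring and time alignment) can be sketched in a few lines. This is a Python illustration rather than the deck's SAS. The function names, Unit 1's failure date, and the study end date are assumptions chosen to match the dates on the slides. The flag follows the censor(1) convention used in the PROC LIFEREG calls: 1 means still in service, 0 means an observed failure.

```python
from datetime import date

def weeks_in_service(in_service, end):
    """Whole weeks between the in-service date and the end date."""
    return (end - in_service).days // 7

def time_align(in_service, failed_on, study_end):
    """Return (weeks_in_service, censor) for one unit.

    censor = 1: no failure observed by study_end (right censored).
    censor = 0: a failure was observed on failed_on.
    """
    if failed_on is not None and failed_on <= study_end:
        return weeks_in_service(in_service, failed_on), 0
    return weeks_in_service(in_service, study_end), 1

study_end = date(2016, 6, 29)  # assumed end of the study period

# Unit 1: in service 1 Mar 16, fails 16 weeks later (21 Jun 16, assumed date).
unit1 = time_align(date(2016, 3, 1), date(2016, 6, 21), study_end)
# Unit 3: in service 29 Mar 16, no failure observed during the study.
unit3 = time_align(date(2016, 3, 29), None, study_end)

print(unit1, unit3)
```

Both units are now measured from their own Week 0, so a unit that entered service late simply contributes a shorter censored record.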


Survival Model Concepts

• Preparing the data

[Tables: "In Service Data" and "Returns by Week FOR each in-service date"]

Two ways to get 1 week in service.


Survival Model Concepts

• Preparing the data: Censored / Time-Aligned Data

[Table: censored, time-aligned data]

• Censored "replacements" are actually units still in service. This row has only been in service 1 week before the end of the study.
• These 44 units actually failed only 1 week after being placed in service. It is the sum of all units failing 1 week after any in-service date.
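A minimal Python sketch of how such a censored, time-aligned table could be assembled. The shipment and return counts below are hypothetical (the slide's tables were images and are not recoverable). They are picked so that the one-week failures add up to 44, echoing the callout above. Column meanings follow the deck: WeeksInService, censor (1 = still in service), and replacements (the count used as the weight).

```python
from collections import Counter

# Hypothetical inputs: units shipped per in-service week, and returns
# keyed by (in-service week, return week).
shipped = {1: 1000, 2: 1200, 3: 900}
returns = {(1, 2): 20, (2, 3): 24, (1, 3): 5}
study_end_week = 4

# Failures: weeks in service = return week - in-service week,
# summed over every in-service week.
failed = Counter()
for (ship_week, return_week), n in returns.items():
    failed[return_week - ship_week] += n

# Censored rows: units never returned, observed until the study ends.
censored = Counter()
for ship_week, n in shipped.items():
    returned = sum(r for (s, _), r in returns.items() if s == ship_week)
    censored[study_end_week - ship_week] += n - returned

# One row per (WeeksInService, censor): failures first, then censored units.
rows = [(w, 0, n) for w, n in sorted(failed.items())]
rows += [(w, 1, n) for w, n in sorted(censored.items())]

for weeks, censor, replacements in rows:
    print(weeks, censor, replacements)
```

The row with 1 week in service and censor = 0 pools failures from two different in-service weeks, which is the "two ways to get 1 week in service" situation.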


Survival Model Concepts

• Preparing the data: Forecast Template

[Table: forecast template]

For forecasting (after the model is fit), we focus back on only those units still in service.


Comparing Models
Distributions

• Exponential
• Weibull 2p
• Weibull 3p
• LogNormal
• Generalized Gamma

SOME of the relationships among the distributions (Scale and Shape as reported by PROC LIFEREG, where Scale is the reciprocal of the Weibull shape):

• Exponential is Weibull 2p with Scale=1
• Weibull 2p is Generalized Gamma with Shape=1
• Weibull 3p is Weibull 2p with an offset parameter
• LogNormal is Generalized Gamma with Shape=0

NOTE: distribution information from https://en.wikipedia.org/wiki/Exponential_distribution
and https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_lifereg_sect019.htm
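Two of these relationships can be checked numerically. The Python sketch below assumes the CDF form that the deck's DATA step code computes later, F(t) = 1 - exp(-((t - δ)/α)^γ), with α as scale, γ as shape and δ as offset. The function names are illustrative and not part of any SAS procedure.

```python
import math

def weibull_cdf(t, alpha, gamma=1.0, delta=0.0):
    """F(t) = 1 - exp(-((t - delta) / alpha) ** gamma); 0 for t <= delta."""
    if t <= delta:
        return 0.0
    return 1.0 - math.exp(-(((t - delta) / alpha) ** gamma))

def exponential_cdf(t, alpha):
    """F(t) = 1 - exp(-t / alpha)."""
    return 1.0 - math.exp(-t / alpha)

alpha = 10443.0  # exponential estimate quoted in the deck

for t in (4, 13, 26):
    # Weibull 2p with shape 1 reduces to the exponential.
    assert math.isclose(weibull_cdf(t, alpha, gamma=1.0), exponential_cdf(t, alpha))
    # Weibull 3p with a zero offset reduces to Weibull 2p.
    assert weibull_cdf(t, 1800.0, 1.374, delta=0.0) == weibull_cdf(t, 1800.0, 1.374)
    print(t, round(exponential_cdf(t, alpha), 6))
```

The Generalized Gamma relationships need an incomplete gamma function, which the Python standard library does not provide, so they are left out of this check.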


Comparing Models
Exponential

CDF: $F(t) = 1 - G(t)$, where $G(t) = \exp(-t/\alpha)$ is the survival function

$\alpha = 10443$

/* ODS OUTPUT must precede the step whose table it captures */
ods output ParameterEstimates = exp_pe2 ;
proc lifereg data=Returns_Censored outest=pe_Exponential ;
   model WeeksInService*censor(1)= / distribution=exponential ;
   weight replacements ;
   output out=resid_exponential sres=sresiduals ;
   probplot / hlower=.05 ;
   inset ;
run ;


Comparing Models
Weibull 2p

CDF: $F(t) = 1 - G(t)$, where $G(t) = \exp\left(-(t/\alpha)^{\gamma}\right)$

$\alpha = 1800$, $\gamma = 1.374$

/* ODS OUTPUT must precede the step whose table it captures */
ods output ParameterEstimates = w2p_pe2 ;
proc lifereg data=Returns_Censored outest=pe_Weibull2p ;
   model WeeksInService*censor(1)= / distribution=weibull ;
   weight replacements ;
   output out=resid_Weibull2p sres=sresiduals ;
   probplot / hlower=0.05 ;
   inset ;
run ;


Comparing Models
Weibull 3p

The Weibull 3p model is a generalization of the Weibull 2p model where a "location" or offset parameter is added. This offset represents the "minimum time to event".

CDF: $F(t) = 1 - G(t)$, where $G(t) = \exp\left(-((t-\delta)/\alpha)^{\gamma}\right)$ for $t > \delta$

$\alpha = 3382$, $\gamma = 1.195$, $\delta = 0.971$

/* ODS OUTPUT must precede the step whose table it captures */
ods output ParmEst = pe_W3P ;
proc reliability data=Returns_Censored ;
   freq replacements ;
   distribution W3 ;
   probplot WeeksInService*Censor(1) ;
run ;


Comparing Models
LogNormal

CDF: [formula image not recovered]

$\mu = 9.999$, $\sigma = 2.443$

/* ODS OUTPUT must precede the step whose table it captures */
ods output ParameterEstimates = ln_pe2 ;
proc lifereg data=Returns_Censored outest=pe_LogNormal ;
   model WeeksInService*censor(1)= / distribution=lognormal ;
   weight replacements ;
   output out=resid_LogNormal sres=sresiduals ;
   probplot / hlower=0.05 ;
   inset ;
run ;


Comparing Models
Generalized Gamma

CDF: [formula image not recovered]

$a = 9.838$, $d = 18.788$, $p = -13.351$

/* ODS OUTPUT must precede the step whose table it captures */
ods output ParameterEstimates = gg_pe2 ;
proc lifereg data=Returns_Censored
             inest=in_estw outest=pe_GGamma ;
   model WeeksInService*censor(1)= / distribution=gamma ;
   weight replacements ;
   output out=resid_GGamma sres=sresiduals ;
   probplot ;
   inset ;
run ;


Generalized Gamma
Setting initial parameters

NOTE: The Generalized Gamma is a fairly complex distribution and may have convergence problems in maximum likelihood parameter estimation.

Two steps to help with convergence are:
1) Start the parameter search at a reasonable position, such as the Weibull 2p estimates
2) Set the maximum iterations to a higher number

/* Step 1: fit Weibull 2p and keep its estimates as starting values */
proc lifereg data=Returns_Censored outest=out_estw noprint ;
   model WeeksInService*censor(1)= / distribution=Weibull maxiter=5000 ;
   weight replacements ;
run ;

/* Relabel the Weibull estimates as a Gamma start; Shape=1 is the Weibull 2p case */
data in_estw ;
   set out_estw ;
   _dist_ = "Gamma" ;
   _shape1_ = 1 ; * Weibull 2p * ;
run ;

/* Step 2: fit the Generalized Gamma from those values with a higher iteration limit */
proc lifereg data=Returns_Censored
             inest=in_estw outest=pe_GGamma ;
   model WeeksInService*censor(1)= / distribution=gamma maxiter=10000 ;
   weight replacements ;
   output out=resid_GGamma sres=sresiduals ;
   probplot ;
   inset ;
run ;


Applying Models to Forecasts

• Have models
• Have parameter estimates
• How do we apply these to get estimated future values?


Applying Models – Points in Time
Exponential: Direct Formula

CDF: $F(t) = 1 - \exp(-t/\alpha)$, $\alpha = 10443$

0.25% chance of failure by week 26

/* exp_pe2 is the table captured on the model fit with:
   ods output ParameterEstimates = exp_pe2 ; */
proc sql ;
   select estimate
      into :alpha
      from exp_pe2
      where parameter = 'Weibull Scale' ;
quit ;   /* PROC SQL ends with QUIT; RUN has no effect */

data prob_failure ;
   do WeeksInService = 4, 13, 26 ;
      cdf = 1 - exp(-WeeksInService / &alpha.) ;
      output ;
   end ;
run ;


Applying Models – Points in Time
Weibull 2p: Direct Formula

CDF: $F(t) = 1 - \exp\left(-(t/\alpha)^{\gamma}\right)$, $\alpha = 1800$, $\gamma = 1.374$

0.30% chance of failure by week 26

proc sql ;
   /* alpha = Weibull scale, matching the formula above */
   select put(estimate, 15.10) as estimate
      into :alpha
      from w2p_pe2
      where parameter = 'Weibull Scale' ;
   /* gamma = Weibull shape */
   select put(estimate, 15.10) as estimate
      into :gamma
      from w2p_pe2
      where parameter = 'Weibull Shape' ;
quit ;   /* PROC SQL ends with QUIT; RUN has no effect */

data prob_failure ;
   do WeeksInService = 4, 13, 26 ;
      cdf = 1 - exp(-((WeeksInService / &alpha.) ** &gamma.)) ;
      output ;
   end ;
run ;
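As a cross-check of the direct formula outside SAS, here is a small Python sketch. It assumes α = 1800 is the Weibull scale and γ = 1.374 the Weibull shape, as quoted on the slide. The function name is illustrative.

```python
import math

def weibull2p_cdf(t, alpha, gamma):
    """Probability of failure by time t: 1 - exp(-(t / alpha) ** gamma)."""
    return 1.0 - math.exp(-((t / alpha) ** gamma))

alpha, gamma = 1800.0, 1.374  # scale and shape quoted on the slide

# Same evaluation points as the DATA step: weeks 4, 13 and 26.
prob_failure = {t: weibull2p_cdf(t, alpha, gamma) for t in (4, 13, 26)}

for t, p in prob_failure.items():
    print(f"week {t:2d}: {100 * p:.2f}%")
```

With the rounded estimates from the slide, the week-26 value lands close to the 0.30% reported there.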


Applying Models – Points in Time
Weibull 3p: Direct Formula

CDF: $F(t) = 1 - \exp\left(-((t-\delta)/\alpha)^{\gamma}\right)$, $\alpha = 3382$, $\gamma = 1.195$, $\delta = 0.971$

0.28% chance of failure by week 26

proc sql ;
   /* &alpha. (scale) and &gamma. (shape) are assumed to be read from pe_W3P in the same way */
   select estimate
      into :delta
      from pe_W3P
      where parameter = 'Weibull Threshold' ;
quit ;

data prob_failure ;
   do WeeksInService = 4, 13, 26 ;
      cdf = 1 - exp(-(((WeeksInService - &delta.) / &alpha.) ** &gamma.)) ;
      output ;
   end ;
run ;
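The role of the offset is easy to see in a short Python sketch. It assumes α = 3382 is the scale, γ = 1.195 the shape and δ = 0.971 the threshold, as quoted on the slide. Returning 0 for t ≤ δ is this sketch's way of encoding "minimum time to event". The DATA step itself does not guard that case, and its evaluation points (4, 13, 26) all lie beyond δ.

```python
import math

def weibull3p_cdf(t, alpha, gamma, delta):
    """1 - exp(-((t - delta) / alpha) ** gamma), with no failures before delta."""
    if t <= delta:
        return 0.0  # before the minimum time to event
    return 1.0 - math.exp(-(((t - delta) / alpha) ** gamma))

alpha, gamma, delta = 3382.0, 1.195, 0.971  # estimates quoted on the slide

prob_failure = {t: weibull3p_cdf(t, alpha, gamma, delta) for t in (4, 13, 26)}

for t, p in prob_failure.items():
    print(f"week {t:2d}: {100 * p:.2f}%")
```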


Applying Models – Points in Time
LogNormal: CDF Function

$\mu = 9.999$, $\sigma = 2.443$

0.29% chance of failure by week 26

proc lifereg data=Returns_Censored outest=pe_LogNormal ;
   ...

/* Read the fitted intercept (mu) and scale (sigma) from the OUTEST= data set */
proc sql ;
   select intercept, _scale_ into :mu, :sigma
      from pe_lognormal ;
quit ;

data prob_failure ;
   do WeeksInService = 4, 13, 26 ;
      cdf2 = cdf('lognormal', WeeksInService, &mu., &sigma.) ;
      output ;
   end ;
run ;
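For readers without SAS at hand, the lognormal CDF can be written with the standard normal CDF, $F(t) = \Phi((\ln t - \mu)/\sigma)$. The Python sketch below builds Φ from math.erf. It assumes μ and σ are the log-scale location and scale quoted on the slide. The function name is illustrative.

```python
import math

def lognormal_cdf(t, mu, sigma):
    """P(T <= t) for a lognormal: Phi((ln t - mu) / sigma)."""
    z = (math.log(t) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 9.999, 2.443  # estimates quoted on the slide

prob_failure = {t: lognormal_cdf(t, mu, sigma) for t in (4, 13, 26)}

for t, p in prob_failure.items():
    print(f"week {t:2d}: {100 * p:.2f}%")
```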


Applying Models – Points in Time
Generalized Gamma: LIFEREG … maxiter=0

$a = 9.838$, $d = 18.788$, $p = -13.351$

0.25% chance of failure by week 26

proc lifereg … outest=pe_GGamma ;

/* New rows to score: one record per week of interest */
data pfail_in ;
   censor = 0 ;
   replacements = 1 ;
   do WeeksInService = 4, 13, 26 ;
      output ;
   end ;
run ;

/* maxiter=0 applies the INEST= estimates without refitting; OUTPUT writes the CDF for the new rows */
proc lifereg data=pfail_in inest=pe_GGamma noprint ;
   model WeeksInService*censor(1)= / distribution=gamma maxiter=0 ;
   output out=prob_failurel cdf=cdf ;
run ;


Applying Models – Weekly Returns

• Generate periods / weeks for projection
• Determine units still in service from each source (ship week)
• Apply models to get probability of failure each week
• For each source (ship week): apply weekly failure rates, remove units from service for next week, predict next week's failure
• Align returns by source (ship week) and summarize expected returns each future week


Applying Models – Weekly Returns
Generate periods / weeks for projection

ForecastIn

/* One row per week to project; &numperiods. is the forecast horizon in weeks */
data forecastin ;
   fcstrange = &numperiods. ;
   do WeeksInService = 1 to fcstrange ;
      replacements = 1 ;
      censor = 0 ;
      output ;
   end ;
run ;


Applying Models – Weekly Returns
Apply models to get probability of failure each week

/* Score each projection week with the fitted model (no refit) */
proc lifereg data=forecastin inest=pe_GGamma noprint ;
   model WeeksInService*censor(1)= / distribution=gamma maxiter=0 ;
   weight replacements ;
   output out=predcdf cdf=cdf ;
run ;

/* Weekly failure share = this week's CDF minus last week's CDF */
data predpct ;
   set predcdf (keep=WeeksInService CDF) ;
   prevcdf = lag(cdf) ;
   if _n_ = 1 then prevcdf = 0 ;
   retn_pct = cdf - prevcdf ;
run ;
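The differencing step can be illustrated in Python. As a stand-in for the Generalized Gamma CDF (which needs an incomplete gamma function not available in the standard library), this sketch uses the deck's Weibull 2p estimates. That substitution, the 26-week horizon and the function name are assumptions made for illustration.

```python
import math

def weekly_failure_pct(cdf_by_week):
    """Turn cumulative failure probabilities into per-week increments."""
    pct = []
    prev = 0.0  # nothing has failed before week 1 (prevcdf = 0 on the first row)
    for cdf in cdf_by_week:
        pct.append(cdf - prev)
        prev = cdf
    return pct

# Stand-in CDF: Weibull 2p with the scale and shape quoted in the deck.
alpha, gamma = 1800.0, 1.374
cdf_by_week = [1.0 - math.exp(-((w / alpha) ** gamma)) for w in range(1, 27)]

retn_pct = weekly_failure_pct(cdf_by_week)

print(f"week 1: {retn_pct[0]:.6f}, week 26: {retn_pct[-1]:.6f}")
```

Because the increments telescope, they add back up to the cumulative probability at the last projected week.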


Applying Models – Weekly Returns
Determine units still in service from each source (ship week)

proc sql ;
   create table UnitsInService as
      select ship_Week,
             WeeksInService,
             censor,
             field_pop
         from ShippedStillInField a
         where censor = 1
         /* no summary function is selected, so sort instead of GROUP BY */
         order by WeeksInService, censor
   ;
quit ;   /* PROC SQL ends with QUIT; RUN has no effect */


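Applying Models – Weekly Returns
For each source (ship week): apply weekly failure rates, remove units from service for next week, predict next week's failure

/* Put the weekly failure shares on one row: retnpct1, retnpct2, ... */
proc transpose data=predpct out=predpctt prefix=retnpct ;
   var retn_pct ;
run ;

data fct ;
   set UnitsInService ;
   /* attach the single row of weekly shares to every ship-week record */
   if _n_ = 1 then set predpctt (drop=_name_) ;
   retain retnpct: ;
   array retnpct(*) retnpct: ;
   array forc[&numperiods.] ;
   /* skip the weeks this cohort has already been in service */
   offset = WeeksInService - 1 ;
   do i = 1 to (&numperiods.-offset) ;
      forc[i] = round(field_pop * retnpct(i+offset), 1) ;   /* expected failures in future week i */
      field_pop = field_pop - forc[i] ;                     /* remove them before the next week */
   end ;
run ;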
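The per-cohort loop in the DATA step above can be mirrored in Python to make the indexing easier to follow. This is a sketch of the same logic and not a replacement for it. The function name and the numbers in the usage lines are hypothetical. One known difference is that Python's round() sends exact halves to the nearest even integer, whereas the SAS ROUND function rounds them away from zero.

```python
def forecast_cohort(field_pop, weeks_in_service, retn_pct):
    """Expected failures per future week for one ship-week cohort.

    retn_pct is a 0-based list: retn_pct[k] is the failure share for
    service week k + 1, i.e. retnpct(k + 1) in the DATA step.
    """
    offset = weeks_in_service - 1  # weeks already covered by this cohort
    forc = []
    for i in range(len(retn_pct) - offset):
        failures = round(field_pop * retn_pct[i + offset])
        forc.append(failures)
        field_pop -= failures  # these units leave service before next week
    return forc

# Hypothetical weekly failure shares and a cohort of 1000 units.
retn_pct = [0.01, 0.02, 0.03, 0.04]
new_cohort = forecast_cohort(1000, 1, retn_pct)    # just entered service
older_cohort = forecast_cohort(1000, 2, retn_pct)  # already in its second week

print(new_cohort, older_cohort)
```

An older cohort gets a shorter forecast row because part of the horizon is already behind it.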


Applying Models – Weekly Returns
Align returns by source (ship week) and summarize expected returns each future week.

data forecast ;
   merge shipsum fct_sort ;
   by descending WeeksInService ;
run ;

/* One row per ship week; the SUM statement totals expected returns for each future week */
proc print data=forecast noobs label ;
   id ship_week ;
   var field_pop forc1-forc26 ;
   sum field_pop forc1-forc26 ;
   label field_pop="Units in Service" ;
   format field_pop forc: comma9. ;
run ;
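The summary step amounts to adding up each future-week column across ship weeks. A minimal Python sketch follows, with hypothetical cohort rows. The dictionary keys and counts are made up for illustration.

```python
from itertools import zip_longest

# Hypothetical per-cohort forecasts: ship week -> expected returns for
# future weeks 1, 2, 3 (older cohorts have shorter rows).
forecast = {
    "ship week 1": [38],
    "ship week 2": [29, 38],
    "ship week 3": [20, 29, 38],
}

# Column totals: expected returns in each future week across all cohorts.
totals = [sum(col) for col in zip_longest(*forecast.values(), fillvalue=0)]

for week, total in enumerate(totals, start=1):
    print(f"future week {week}: {total} expected returns")
```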


Applying Models – Weekly Returns
Comparing results with graphs

[Figures: Generalized Gamma and Weibull 3p]


Applying Models – Summary

Distribution         Formula   CDF()   LIFEREG
Exponential          Yes       Yes     Yes
Weibull 2p           Yes       Yes     Yes
Weibull 3p           Yes       Yes     With mods
LogNormal            -         Yes     Yes
Generalized Gamma    -         -       Yes

NOTE: Searching the internet for applications of the Generalized Gamma in SAS leads to many unanswered questions. The few answers I could find focused on implementing the incomplete gamma function via SAS/IML. The use of LIFEREG with maxiter=0 as a means of forecasting with an existing model and new data was not directly documented.


Conclusion – selecting model

• Can structure comparisons leveraging relationships of models and nested ML
  - LogNormal, Exponential and Weibull 2p are all instances of Generalized Gamma
  - But Weibull 3p is not?
• Could structure comparison using RMSE of actual vs predicted
• Ultimately test against reality
  - Understand "essentially, all models are wrong, but some are useful" (George E. P. Box)
• Use simplest projection method(s)
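The RMSE comparison mentioned above could look like the following Python sketch. The weekly counts and the labels "model A" and "model B" are hypothetical placeholders and are not results from the deck.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical weekly returns and two candidate forecasts.
actual = [40, 52, 61, 58]
candidates = {
    "model A": [38, 50, 63, 60],
    "model B": [45, 45, 70, 50],
}

scores = {name: rmse(actual, pred) for name, pred in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: RMSE = {score:.2f}")
print("lowest RMSE:", best)
```

A held-out set of weeks, not used for fitting, gives a fairer comparison than scoring the same weeks the models were fit on.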
