deploying doe to predict process performance

6/29/2021

Copyright © 2021 Stat‐Ease, Inc. Do not copy or redistribute in any form.

Deploying DOE to Predict Process Performance

Shari Kraber – Statistical Consultant

Stat‐Ease, Inc.

Minnesota Quality Conference, November 4 & 5, 2020

June 2021

Learning Objectives

In this session you will:

Learn about response surface (RSM) designs for predictions

Discover how to “right‐size” DOEs for making predictions

Experience an RSM case study sized correctly, analyzed, optimized, and used for making predictions



Agenda


Introduction to DOE for predictions

RSM Case Study

Size the design for predictions

RSM analysis

Optimization

Making predictions

Appendix – Comparing Common RSM Designs


Strategy of Experimentation



Differences in DOE Objectives: Factorial versus RSM

Factorials

Focus is on screening and characterization to identify main factor effects and interactions, respectively.

What are the important process factors?

For this purpose, power is an ideal metric to evaluate design suitability.

Response Surface Methods

For optimization, the emphasis shifts to fitting a response surface and making predictions.

How well does the surface represent true behavior?

To address this question, precision is a better measure of the experiment design.

Apply strategy of experimentation and DOE process all the way!

RSM Goal: Making Precise Predictions

Goal: Use RSM to fit a polynomial (relationship between factors and response) and then make predictions around the design space.


The location of the true optimum is unknown. Evaluate how much of the design space can predict the response with the desired precision.



RSM Goal: Making Precise Predictions

The precision of prediction depends on the location in the design space and the standard deviation “s” of the response. For example, notice how the confidence bands (dotted) vary over this one‐factor response surface.

In this case, only about 50% of the design space falls within the desired range “d” (red).

To right‐size an RSM design, do enough runs to achieve 80% FDS (fraction of design space). With the aid of Design‐Expert (DX), FDS can be calculated on the basis of “d” and “s”.

The Inputs for Sizing via FDS

Precision (d): This is a business decision. How well do you want to make predictions? The more precision you want, the more data is required.

“We want to estimate the mean response with a precision (d) of +/‐ 0.80.”

Standard Deviation (s): This is the process standard deviation (including sampling and test variation). It is typically estimated from historical data, prior DOEs or other means. The greater the standard deviation, the more data is required.

Historical data displays a standard deviation of 0.50.



Examples for “d” and “s”

Response              Desired Precision (d)*   Standard Deviation (s)**
Viscosity             𝑌 +/‐ 0.15 cp            0.12 cp
Chemical conversion   𝑌 +/‐ 5%                 4%
Flex modulus          𝑌 +/‐ 4 psi              3.7 psi
Avg thickness         𝑌 +/‐ 4.5 mm             3 mm

* Precision – a business decision

** Standard deviation – generally calculated from historical data

Standard Error Plot: Two‐factor FCD (1/2)

This plot shows the standard error of the predictions for a face‐centered central composite design (FCD). Note the squared‐off contours (a CCD produces ones that are circular).



Standard Error (SE) Plot FDS (2/2)

The fraction of design space (FDS) graph provides a profile of the prediction error across the design space. In this case 25%—the inner core—falls within 0.43 and so on.

[Figure: FDS Graph. StdErr (0.00 to 1.00) versus Fraction of Design Space (0.00 to 1.00), with standard error levels 0.43, 0.50, and 0.61 marked. Annotation: 25% of the design space falls within this region.]
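An FDS profile of this kind can be approximated numerically. The following is a minimal sketch and not Design‐Expert's algorithm: it assumes a two‐factor face‐centered design with five center points (the run count is an assumption; the slide does not state it), a full quadratic model, and uniform random sampling of the coded square. The quantity computed is the standardized standard error of the predicted mean, that is, the value for s = 1.

```python
import numpy as np


def quadratic_terms(points):
    """Expand coded two-factor points into full quadratic model columns."""
    a, b = points[:, 0], points[:, 1]
    return np.column_stack([np.ones(len(points)), a, b, a * b, a**2, b**2])


# Assumed design: 4 factorial + 4 face-centered axial + 5 center points.
factorial = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
axial = [(-1, 0), (1, 0), (0, -1), (0, 1)]
center = [(0, 0)] * 5
design = np.array(factorial + axial + center, dtype=float)

X = quadratic_terms(design)
xtx_inv = np.linalg.inv(X.T @ X)

# Sample the coded design space uniformly and evaluate the standardized
# standard error of the mean prediction, sqrt(x' (X'X)^-1 x), at each point.
rng = np.random.default_rng(seed=1)
samples = rng.uniform(-1.0, 1.0, size=(20_000, 2))
F = quadratic_terms(samples)
std_err = np.sqrt(np.einsum("ij,jk,ik->i", F, xtx_inv, F))

# The FDS curve is the sorted standard errors plotted against the
# cumulative fraction of sampled points.
fds_curve = np.sort(std_err)
fractions = np.arange(1, len(fds_curve) + 1) / len(fds_curve)

for q in (0.25, 0.50, 0.75, 1.00):
    print(f"{q:.0%} of the design space has StdErr <= {np.quantile(std_err, q):.3f}")
```

Plotting `fds_curve` against `fractions` gives a graph of the same shape as the FDS graph; the printed values depend on the assumed design and need not match the numbers on the slide.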

Making Precise Predictions (Example) – Fraction of Design Space (FDS)

Specify the desired precision (d) of predictions, and the process std.dev. (s).

d = 0.8, s = 0.5

FDS = 0.89

89% of the design space predicts with a precision of 0.80 or better.
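To see how d and s connect to the FDS graph, it can help to convert them into a cutoff on the standardized standard error. A minimal sketch, assuming the precision is judged by the half‐width of a 95% two‐sided confidence interval on the predicted mean (t × s × standardized StdErr) and assuming 10 residual degrees of freedom; the degrees of freedom are hypothetical because the slide does not state the design.

```python
# Convert the sizing inputs (d, s) into a cutoff on the standardized
# standard error that the FDS graph is read against.

d = 0.80  # desired precision (half-width of the interval on the mean)
s = 0.50  # estimated process standard deviation

# Assumption: 95% two-sided confidence and 10 residual degrees of freedom.
# 2.228 is the tabulated Student's t value t(0.975, 10).
t_crit = 2.228

signal_to_noise = d / s
max_std_err = d / (t_crit * s)

print(f"d/s = {signal_to_noise:.2f}")
print(f"Points with standardized StdErr <= {max_std_err:.3f} meet the precision goal")
```

FDS is then the fraction of the design space whose standardized standard error falls at or below this cutoff. A larger d, a smaller s, or more runs (which lowers the standard error and the t value) all raise the FDS.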



Sizing for Precision – What Level of FDS is Good Enough?

How good is good enough? Rules of thumb:

For exploration want FDS ≥ 80%

For verification want FDS of 95‐100%!

What can be done to improve precision?

Manage expectations; i.e., increase d

Decrease noise; i.e., decrease s

Increase the number of runs in the design



Response Surface Method Case

This case study on a chemical process has two responses:

y1 ‐ Conversion (%)

y2 ‐ Activity

There are three process factors:

A ‐ time (minutes)

B ‐ temperature (degrees C)

C ‐ catalyst (percent)

Central composite design runs are conducted in two blocks:

1. 8 factorial points, plus 4 center points (12 runs total)

2. 6 star points, plus 2 center points (8 runs).
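The two‐block layout above can be generated in coded units as follows. This is a sketch under one stated assumption: the axial distance is the rotatable alpha (4th root of the number of factorial points), which the slide does not specify for this case.

```python
from itertools import product

K = 3  # number of factors: A (time), B (temperature), C (catalyst)

# Assumption: rotatable alpha, the 4th root of the number of factorial points.
alpha = (2 ** K) ** 0.25

factorial_points = [tuple(float(v) for v in p) for p in product((-1, 1), repeat=K)]

axial_points = []
for axis in range(K):
    for sign in (-1.0, 1.0):
        point = [0.0] * K
        point[axis] = sign * alpha
        axial_points.append(tuple(point))

center = (0.0,) * K

block_1 = factorial_points + [center] * 4  # 8 factorial + 4 center = 12 runs
block_2 = axial_points + [center] * 2      # 6 star + 2 center = 8 runs

for name, block in (("Block 1", block_1), ("Block 2", block_2)):
    print(name, len(block), "runs")
    for run in block:
        print("  ", " ".join(f"{v:+.3f}" for v in run))
```

The listing is in standard order; in practice the run order would be randomized within each block.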


Response Surface Methods Case – RSM DOE Process (page 1 of 2)

1. Identify opportunity and define objective.

Maximize conversion to be >80

Find conditions that target Activity at 63 ± 3

2. State objective in terms of measurable responses.

Define the precision (of predictions) needed.

• predict average conversion ± 5.0%

• predict average activity ± 1.3

Estimate experimental error (s) for each response.

• s_conversion ≈ 4.0

• s_activity ≈ 1.1


Response Surface Methods Case – RSM DOE Process (page 2 of 2)

3. Select the input factors and ranges to study. (Consider both your region of interest and region of operability.)

40 to 50 minutes, 80° to 90°C, and 2 to 3% catalyst

4. Choose the polynomial to estimate. Quadratic

5. Select a design (Central Composite) and:

Size design for precision needed.

Examine the design layout to ensure all the factor combinations are safe to run and are likely to result in meaningful information (no disasters).
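One way to examine the layout before running it is to convert the coded axial (star) levels into actual units and check them against the region of operability. A minimal sketch using the factor ranges from step 3 (40 to 50 minutes, 80° to 90°C, 2 to 3% catalyst); the rotatable alpha is an assumption, since the slide does not state the axial distance used.

```python
def coded_to_actual(coded, low, high):
    """Map a coded level (-1 = low, +1 = high) to actual units."""
    center = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return center + coded * half_range


# Factor ranges from the case study (coded -1 to +1).
ranges = {
    "time (min)": (40.0, 50.0),
    "temperature (deg C)": (80.0, 90.0),
    "catalyst (%)": (2.0, 3.0),
}

# Assumption: rotatable alpha for three factors, (2**3) ** 0.25.
alpha = 8 ** 0.25

axial_levels = {
    name: (coded_to_actual(-alpha, lo, hi), coded_to_actual(alpha, lo, hi))
    for name, (lo, hi) in ranges.items()
}

for name, (low_axial, high_axial) in axial_levels.items():
    print(f"{name}: axial runs at {low_axial:.2f} and {high_axial:.2f}")
```

Under this assumption the star points fall outside the factorial ranges (for example, time at about 36.59 and 53.41 minutes), which is exactly the kind of combination to confirm as safe to run.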



Response Surface Methods Case – Evaluate Design

Check precision:

Define the precision needed.

• predict average conversion ± 5.0%; d = 5

• predict average activity ± 1.3; d = 1.3

Estimate experimental error (s) for each response.

• s_conversion ≈ 4.0, so d/s = 1.25

• s_activity ≈ 1.1, so d/s = 1.18*

*Check this one: it is the worst case (smallest d/s).
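The d/s arithmetic and the choice of the worst case can be written out as a short check. The dictionary layout and names are illustrative only.

```python
# Precision targets (d) and estimated standard deviations (s) from the case.
responses = {
    "conversion": {"d": 5.0, "s": 4.0},
    "activity": {"d": 1.3, "s": 1.1},
}

ratios = {name: v["d"] / v["s"] for name, v in responses.items()}

# The response with the smallest d/s is the hardest to predict precisely,
# so it drives the sizing of the design.
worst_case = min(ratios, key=ratios.get)

for name, ratio in ratios.items():
    print(f"{name}: d/s = {ratio:.2f}")
print("Size the design for:", worst_case)
```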

Response Surface Methods Case – Evaluate Design (FDS)

FDS = 0.84

84% of the design space will predict as precisely as we want.


Response Surface Methods Case – Evaluate Design

What if FDS is too low? (i.e., <80%)

Increase the +/‐ d: can you accept less precision?

Reduce the error in the process

Increase the number of runs – add 4‐5 runs and see if that helps



Response Surface Methods Case – ANOVA for Quadratic Model

Source          Sum of Squares   df   Mean Square   F‐value    p‐value
Block                    64.53    1         64.53
Model                  2561.82    9        284.65     16.87     0.0001   significant
A‐time                   14.44    1         14.44    0.8561     0.3790
B‐temperature           222.96    1        222.96     13.21     0.0054
C‐catalyst              525.64    1        525.64     31.15     0.0003
AB                       36.13    1         36.13      2.14     0.1774
AC                     1035.12    1       1035.12     61.35   < 0.0001
BC                      120.12    1        120.12      7.12     0.0257
A²                       51.76    1         51.76      3.07     0.1138
B²                      119.19    1        119.19      7.06     0.0261
C²                      397.61    1        397.61     23.57     0.0009
Residual                151.85    9         16.87
Lack of Fit              46.60    5          9.32    0.3542     0.8574   not significant
Pure Error              105.25    4         26.31
Cor Total              2778.20   19
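The mean squares and F‐values in the ANOVA table above follow directly from the sums of squares and degrees of freedom. A short check, using only values from the table (p‐values are not recomputed here):

```python
# (sum of squares, degrees of freedom) taken from the ANOVA table above.
model = (2561.82, 9)
residual = (151.85, 9)
lack_of_fit = (46.60, 5)
pure_error = (105.25, 4)


def mean_square(ss_df):
    """Mean square = sum of squares / degrees of freedom."""
    ss, df = ss_df
    return ss / df


ms_model = mean_square(model)
ms_residual = mean_square(residual)
ms_lof = mean_square(lack_of_fit)
ms_pure = mean_square(pure_error)

# Model terms are tested against the residual mean square;
# lack of fit is tested against pure error.
f_model = ms_model / ms_residual
f_lof = ms_lof / ms_pure

# Residual = lack of fit + pure error, for both SS and df.
assert abs(lack_of_fit[0] + pure_error[0] - residual[0]) < 0.01
assert lack_of_fit[1] + pure_error[1] == residual[1]

print(f"MS model       = {ms_model:.2f}, F = {f_model:.2f}")
print(f"MS residual    = {ms_residual:.2f}")
print(f"MS lack of fit = {ms_lof:.2f}, F = {f_lof:.4f}")
```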


Response Surface Methods Case – Post‐ANOVA (Fit Statistics)

Adjusted R²: the amount of variation in the data that is explained by the model (adjusted to prevent putting too many terms in the model, i.e., over‐fitting)

Predicted R²: the amount of variation in predictions that is explained by the model

Std. Dev.   4.11     R²               0.9440
Mean        78.30    Adjusted R²      0.8881
C.V. %      5.25     Predicted R²     0.7891
                     Adeq Precision   16.2944
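Several of these fit statistics can be reproduced from the conversion ANOVA values. The sketch below assumes the block sum of squares is excluded from the total before computing R², which reproduces the reported figures. Predicted R² is not computed because it requires the individual run data (PRESS), which the slides do not list.

```python
import math

# Sums of squares and degrees of freedom from the conversion ANOVA table.
ss_total, ss_block = 2778.20, 64.53
ss_residual, df_residual = 151.85, 9
df_model = 9
mean_response = 78.30

# Assumption: the block effect is removed before computing R-squared.
ss_corrected = ss_total - ss_block
df_corrected = df_model + df_residual

std_dev = math.sqrt(ss_residual / df_residual)
r_squared = 1 - ss_residual / ss_corrected
adj_r_squared = 1 - (ss_residual / df_residual) / (ss_corrected / df_corrected)
cv_percent = 100 * std_dev / mean_response

print(f"Std. Dev.   = {std_dev:.2f}")
print(f"R2          = {r_squared:.4f}")
print(f"Adjusted R2 = {adj_r_squared:.4f}")
print(f"C.V. %      = {cv_percent:.2f}")
```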

Response Surface Methods Case – Residual Analysis

Data (observed values): signal + noise, $y_i$

Analysis: filter the signal from the data

Model (predicted values): signal, $\hat{y}_i$

Residuals (observed ‐ predicted): noise, $e_i = y_i - \hat{y}_i$

Independent, $N(0, \sigma^2)$

Response Surface Methods Case – Diagnostics: Normal Plot & Resid vs Pred

[Figure: Normal Plot of Residuals. Normal % Probability versus Externally Studentized Residuals.]

Passes ‘fat pencil’ test.

[Figure: Residuals vs. Predicted. Externally Studentized Residuals versus Predicted, with limits drawn at ±4.33355.]

No particular pattern.


Response Surface Methods Case – Diagnostics: Run Order & Pred vs Actual

[Figure: Residuals vs. Run. Externally Studentized Residuals versus Run Number, with limits drawn at ±4.33355.]

No points out of bounds and no trends.

[Figure: Predicted vs. Actual. Predicted versus Actual, both axes from 40.0 to 100.0.]

Tight fit up and down the line.

Response Surface Methods Case – Diagnostics: Box‐Cox Plot

[Figure: Box-Cox Plot for Power Transforms (Design-Expert® Software). Ln(ResidualSS) versus Lambda for the Conversion response.]

Current Lambda = 1

Best Lambda = 0.55

CI for Lambda: (-1.42, 3.06)

Recommend transform: None (Lambda = 1)

Current Transform: None

Transform “None.”


Response Surface Methods Case Model Graphs: 2D Contour & 3D Surface



First Step: Develop Good Models – Don’t Over‐Interpret the Statistics!

Be sure the fitted surface adequately represents your process before you use it for optimization. Check for:

1. A significant model: Large F‐value with p<0.05.

2. Insignificant lack‐of‐fit: F‐value with p>0.10.

3. R‐squareds >0.5.

4. Well behaved residuals: Check diagnostic plots!

Let’s review the ANOVA tables for the two exercise responses to make sure they meet criteria 1‐3.
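Criteria 1 to 3 are numeric and can be expressed as a simple checklist; criterion 4 (residual diagnostics) is graphical and is not covered. The function name and structure below are illustrative, and the example values are the conversion statistics reported earlier in this case.

```python
def model_is_adequate(model_p, lack_of_fit_p, r_squared_values):
    """Apply criteria 1-3: significant model, insignificant lack of fit,
    and all R-squared statistics above 0.5."""
    checks = {
        "significant model (p < 0.05)": model_p < 0.05,
        "insignificant lack of fit (p > 0.10)": lack_of_fit_p > 0.10,
        "R-squareds > 0.5": all(r > 0.5 for r in r_squared_values),
    }
    return all(checks.values()), checks


# Conversion response: values reported in the ANOVA and fit statistics.
ok, details = model_is_adequate(
    model_p=0.0001,
    lack_of_fit_p=0.8574,
    r_squared_values=[0.9440, 0.8881, 0.7891],
)

for criterion, passed in details.items():
    print(f"{criterion}: {'pass' if passed else 'fail'}")
print("Ready for optimization:", ok)
```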

First Step: Develop Good Models – Conversion

ANOVA for Quadratic model

Source       Sum of Squares   df   Mean Square   F‐value   p‐value
Block                 64.53    1         64.53
Model               2561.82    9        284.65     16.87    0.0001   significant
Residual             151.85    9         16.87
Pure Error           105.25    4         26.31
Cor Total           2778.20   19

Std. Dev.   4.11    R²             0.9440
C.V. %      5.25    Predicted R²   0.7891


First Step: Develop Good Models – Activity

ANOVA for Linear model

Source       Sum of Squares   df   Mean Square   F‐value    p‐value
Block                0.3967    1        0.3967
Model                316.70    3        105.57    109.78   < 0.0001   significant
Residual              14.42   15        0.9617
Pure Error             3.65    4        0.9131
Cor Total            331.53   19

Std. Dev.   0.9806   R²             0.9564
C.V. %      1.63     Predicted R²   0.9202


Optimization Case Study – Setting Goals: Conversion

Conversion must be 80 percent or higher, ideally 100 percent.

1. Click Conversion.

2. Set Goal to “maximize”.

3. Make the lower limit “80”.

4. Stretch the upper limit to a perfect “100”.

Optimization Case Study – Setting Goals: Activity

Activity at 63 is the goal but anywhere from 60 to 66 is OK.

1. Click Activity.

2. Set Goal to “target” of “63”

3. Enter Limits:

Lower “60” and Upper “66”.

Optimization Case Study – Solution for Multiple Responses: Ramps View

This provides a clear picture of where to set each factor to get the most desirable response levels.

Optimization Case Study – Solution for Multiple Responses: Report

Solutions are reported by desirability, from most to least, based on how well the specified goals are met.

The more closely all goals are met, the higher the overall desirability will be.

Number time temperature catalyst Conversion Activity Desirability

1 47.018 90.000 2.684 91.317 63.000 0.752 Selected

2 47.038 90.000 2.680 91.316 63.000 0.752

3 47.001 90.000 2.688 91.322 63.004 0.752

4 47.105 90.000 2.667 91.304 63.000 0.752

5 46.925 90.000 2.701 91.303 63.000 0.752

6 47.139 90.000 2.661 91.291 63.000 0.751

7 47.214 90.000 2.646 91.250 63.000 0.750

8 46.782 90.000 2.729 91.224 63.000 0.749

9 46.752 90.000 2.735 91.198 63.000 0.748

10 47.412 90.000 2.609 91.049 63.000 0.743

11 47.104 89.997 2.655 91.209 62.946 0.742

12 46.173 90.000 2.842 90.116 62.986 0.710

13 46.320 80.000 2.931 87.392 63.000 0.608

14 46.355 80.000 2.925 87.390 63.000 0.608

15 46.386 80.000 2.919 87.385 63.000 0.608

16 46.440 80.000 2.909 87.370 63.001 0.607

17 46.164 80.000 2.961 87.349 63.000 0.606

18 46.541 80.000 2.889 87.310 63.000 0.605

19 45.978 80.000 2.997 87.188 63.000 0.600
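The desirability values in the report can be approximated with linear desirability ramps combined by a geometric mean. This is a sketch under stated assumptions (linear ramps, equal importance for both responses, factors left “in range” so they do not lower the score); it is not presented as the exact computation used by the software, although it reproduces the reported 0.752 for solution 1.

```python
import math


def desirability_maximize(y, low, high):
    """Linear ramp: 0 at or below `low`, 1 at or above `high`."""
    return min(max((y - low) / (high - low), 0.0), 1.0)


def desirability_target(y, low, target, high):
    """Linear ramps up to the target and back down; 0 outside the limits."""
    if y < low or y > high:
        return 0.0
    if y <= target:
        return (y - low) / (target - low)
    return (high - y) / (high - target)


def overall_desirability(values):
    """Geometric mean of the individual desirabilities."""
    return math.prod(values) ** (1.0 / len(values))


# Solution 1 from the report: Conversion 91.317, Activity 63.000.
d_conversion = desirability_maximize(91.317, low=80.0, high=100.0)
d_activity = desirability_target(63.000, low=60.0, target=63.0, high=66.0)
D = overall_desirability([d_conversion, d_activity])
print(f"{D:.3f}")
```

Because the geometric mean is used, any response with a desirability of zero drives the overall desirability to zero, so a solution that violates one limit is never reported as acceptable.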



Response Surface Method Case – Check Precision at Optimum (page 1 of 2)

Factor Name Level Low Level High Level Std. Dev. Coding

A time 47.02 40.00 50.00 0.0000 Actual

B temperature 90.00 80.00 90.00 0.0000 Actual

C catalyst 2.68 2.00 3.00 0.0000 Actual

What precision is achieved? – Next slide…

Response Surface Method Case – Check Precision at Optimum (page 2 of 2)

Desired precision:

• predict average conversion ± 5.0%

• predict average activity ± 1.3

Precision at optimum:

• predicted average conversion 91.3 ± 4.5%

• predicted average activity 63.0 ± 0.8

Solution 1 of 20

Response     Predicted Mean   Std Dev    SE Mean    95% CI low for Mean   95% CI high for Mean
Conversion   91.3173          4.10758    1.98493    86.8271               95.8075
Activity     63.0001          0.980643   0.376168   62.1983               63.8019
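The precision at the optimum is the half‐width of the 95% confidence interval on the predicted mean, t × SE Mean. A short check using the reported SE Mean values and tabulated Student's t values; the residual degrees of freedom (9 for conversion, 15 for activity) come from the two ANOVA tables.

```python
# Tabulated two-sided 95% Student's t values, t(0.975, df).
T_CRIT = {9: 2.262, 15: 2.131}

predictions = {
    # response: (predicted mean, SE of the mean, residual df, desired d)
    "Conversion": (91.3173, 1.98493, 9, 5.0),
    "Activity": (63.0001, 0.376168, 15, 1.3),
}

intervals = {}
for name, (mean, se_mean, df, d) in predictions.items():
    half_width = T_CRIT[df] * se_mean
    intervals[name] = (mean - half_width, mean + half_width, half_width)
    verdict = "meets" if half_width <= d else "misses"
    print(f"{name}: {mean:.1f} +/- {half_width:.1f} ({verdict} the +/- {d} target)")
```

Both half‐widths (about 4.5 and 0.8) are inside the desired precision, so the design delivers the required precision at the selected optimum.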



Response Surface Methodology – Summary

Changes DOE objective from “detecting effects” to “describing the relationship between the factors and responses” and “making predictions.”

Works best with only a handful of critical factors, those that survive the screening phases of the experimental program.

Produces a polynomial model which gives an approximation of the true response surface over a factor region.

Seeks the optimal settings for process factors so you can target, maximize, minimize, or stabilize the responses of interest.

Using RSM Designs for Precise Predictions – FDS Summary

Response surface designs are the perfect tool for making predictions, but they need to have enough runs.

Sizing for precise predictions requires:

Defining the desired precision (mean +/‐ d)

Estimating the standard deviation of the response

Given this information, the FDS graph tells the percent of the design space that will give predictions with that precision (or better).

The value of this tool increases as your signal‐to‐noise ratio (d/s) decreases, especially with S/N ratios < 1.5.



Recommended RSM Designs

Central Composite design

5 levels, robust to modifications, augment from factorial

Box‐Behnken design

3 levels, streamlined (fewer runs)

Optimal (custom) design

Customize model, add design constraints, flexible


Central Composite Designs

Description:

Robust five‐level design for fitting second‐order response surfaces

Advantages:

Rotatable and orthogonal

Can be run in blocks and augmented from a factorial

Alpha (axial) positions can be modified

Drawbacks:

Axial points may fall outside area of operability

Limited to fitting no more than a quadratic model

Central Composite Design – Template for 3 Factors

Point type              A     B     C
Factorial points:      –1    –1    –1
                       +1    –1    –1
                       –1    +1    –1
                       +1    +1    –1
                       –1    –1    +1
                       +1    –1    +1
                       –1    +1    +1
                       +1    +1    +1
Axial (star) points:   –α     0     0
                       +α     0     0
                        0    –α     0
                        0    +α     0
                        0     0    –α
                        0     0    +α
Center points:          0     0     0
                        0     0     0
                        0     0     0
                        0     0     0
                        0     0     0
                        0     0     0

α (alpha) is the coded distance from the center to the axial (star) points.

Choices for Alpha – Watch Out for Axial Points Going Too Far Out

CCD Options (highlights):

Rotatable (default k < 6)

Ideal statistically, but the alpha value increases as the number of factors (k) goes up, pushing the axial points out too far.

Practical (default k > 5)

This alpha (= k^(1/4), i.e., the 4th root of k) preserves the advantage of pushing axials outside the box, but not too far.

Face centered (not advised for k>8)

Setting alpha at 1 is most convenient because it makes CCDs only 3‐level, but it creates high variance inflation factors (VIFs).
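The three alpha choices can be compared side by side. The sketch below assumes a full two‐level factorial core, for which the rotatable alpha is the 4th root of the number of factorial points; the function names are illustrative.

```python
def rotatable_alpha(k):
    """Rotatable alpha for a full factorial core: 4th root of 2**k."""
    return (2 ** k) ** 0.25


def practical_alpha(k):
    """Practical alpha: 4th root of the number of factors."""
    return k ** 0.25


FACE_CENTERED_ALPHA = 1.0

for k in range(2, 9):
    print(
        f"k={k}: rotatable={rotatable_alpha(k):.3f}, "
        f"practical={practical_alpha(k):.3f}, "
        f"face-centered={FACE_CENTERED_ALPHA:.3f}"
    )
```

The rotatable value grows quickly with k (it doubles every four added factors), while the practical value grows slowly, which is the reason for switching defaults as the number of factors increases.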



Box‐Behnken Designs

Description:

Efficient three‐level designs for fitting second‐order response surfaces for up to 21 factors.

Advantages:

Only 3 levels (vs 5 for CCD)

Can be blocked orthogonally (except for k=3)

Rotatable (for k= 4,7) or nearly so

Drawbacks:

Not as flexible as CCD which allows:

1. Factorial with center points (stop here if no curvature)

2. Second block of axial (star) points only if needed

Box‐Behnken Designs – Point Layout (example k=3)

The geometry of the 3‐factor design involves 12 points lying on a sphere about the center (in this case at √2) with 5 replicates of the center point.

Box‐Behnken Designs – Design Matrix (example k=3)

A B C

–1 –1 0

+1 –1 0

–1 +1 0

+1 +1 0

–1 0 –1

+1 0 –1

–1 0 +1

+1 0 +1

0 –1 –1

0 +1 –1

0 –1 +1

0 +1 +1

*0 0 0

*We suggest 5 center points
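The design matrix above can be generated by running a 2×2 factorial in each pair of factors while the third factor is held at its center level. A minimal sketch for k = 3 (the function name is illustrative; the run order differs from the listing above but the set of points is the same):

```python
from itertools import combinations, product


def box_behnken_3(n_center=5):
    """Three-factor Box-Behnken design in coded units."""
    runs = []
    # For each pair of factors, run a 2x2 factorial while the
    # remaining factor is held at its center level (0).
    for i, j in combinations(range(3), 2):
        for a, b in product((-1, 1), repeat=2):
            point = [0, 0, 0]
            point[i], point[j] = a, b
            runs.append(tuple(point))
    runs.extend([(0, 0, 0)] * n_center)
    return runs


design = box_behnken_3()
edge_points = [p for p in design if any(p)]

for run in design:
    print(" ".join(f"{v:+d}" for v in run))
```

Every non‐center point has exactly two factors at ±1, so its squared distance from the center is 2, which matches the sphere of radius √2 described for the point layout.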


Optimal (custom) Designs

Description:

Computer‐generated design to fit the selected model. Include lack of fit points and replicate points for robustness.

Advantages:

Customize the polynomial model

Add constraints to fit an irregular‐shaped design space

Good design properties if software defaults are used

Drawbacks:

Different point layout each time the design is generated

You don’t control the number of levels

Points may be at inconvenient values (may need to round)

Optimal Design – Design‐Expert’s modified algorithm

1. Select a polynomial that you think is needed to get a decent approximation of the actual response surface.

Usually a quadratic.

2. Software will select design points for:

Model: To allow estimation of all coefficients.

Lack‐of‐fit: In‐between points test how well the model represents actual behavior in our region of interest.

Replicates: To estimate pure error.

Optimal Design – Examples: 2 factor, quadratic model

Design points differ, but both designs are good statistically.


RSM Design Summary

Top DOE choices for RSM designs:

Central Composite: full, fractional, and MR5; alpha values can be modified.

Box‐Behnken: good alternative 3‐level design.

Optimal: most flexible design. Use for:

• designs with constraints

• designs with categoric or discrete factors

• models other than full quadratic

• to augment an existing design

Always choose a design that fits the problem!

Size for precision!
