
Publisher: Institute for Operations Research and the Management Sciences (INFORMS). INFORMS is located in Maryland, USA.

Interfaces

Publication details, including instructions for authors and subscription information: http://pubsonline.informs.org

Statistical and Optimization Techniques for Laundry Portfolio Optimization at Procter & Gamble

Nats Esquejo, Kevin Miller, Kevin Norwood, Ivan Oliveira, Rob Pratt, Ming Zhao

To cite this article: Nats Esquejo, Kevin Miller, Kevin Norwood, Ivan Oliveira, Rob Pratt, Ming Zhao (2015) Statistical and Optimization Techniques for Laundry Portfolio Optimization at Procter & Gamble. Interfaces 45(5):444-461. http://dx.doi.org/10.1287/inte.2015.0802

Full terms and conditions of use: http://pubsonline.informs.org/page/terms-and-conditions

This article may be used only for the purposes of research, teaching, and/or private study. Commercial use or systematic downloading (by robots or other automatic processes) is prohibited without explicit Publisher approval, unless otherwise noted. For more information, contact [email protected].

The Publisher does not warrant or guarantee the article’s accuracy, completeness, merchantability, fitness for a particular purpose, or non-infringement. Descriptions of, or references to, products or publications, or inclusion of an advertisement in this article, neither constitutes nor implies a guarantee, endorsement, or support of claims made of that product, publication, or service.

Copyright © 2015, INFORMS



Vol. 45, No. 5, September–October 2015, pp. 444–461
ISSN 0092-2102 (print) | ISSN 1526-551X (online)
http://dx.doi.org/10.1287/inte.2015.0802

© 2015 INFORMS

Statistical and Optimization Techniques for Laundry Portfolio Optimization at Procter & Gamble

Nats Esquejo
Procter & Gamble, Newcastle-Upon-Tyne NE27 0QW, United Kingdom, [email protected]

Kevin Miller, Kevin Norwood
Procter & Gamble, Cincinnati, Ohio 45202 {[email protected], [email protected]}

Ivan Oliveira, Rob Pratt
SAS, Cary, North Carolina 27513 {[email protected], [email protected]}

Ming Zhao
Department of Decision and Information Sciences, Bauer College of Business, University of Houston, Houston, Texas 77204, [email protected]

The Procter & Gamble (P&G) fabric-care business is a multibillion dollar organization that oversees a global portfolio of products, including household brands such as Tide, Dash, and Gain. Production is impacted by a steady stream of reformulation modifications, imposed by new-product innovation and constantly changing material supply conditions. In this paper, we describe the creation and application of a novel analytical framework that has helped P&G determine the ingredient levels and product and process architectures that enable the company to create some of the world’s best laundry products. Modeling cleaning performance and other key properties such as density required P&G to develop innovative quantitative techniques based on visual statistical tools. It used advanced mathematical programming methods to address challenges that the manufacturing process imposed, product performance requirements, and physical constraints, which collectively result in a hard mixed-integer nonlinear (nonconvex) optimization problem. We describe how P&G applied our framework in its North American market to identify a strategy that improves the performance of its laundry products, provides targeted consumer benefits, and enables cost savings in the order of millions of dollars.

Keywords: pooling; blending; optimization; response surface; design of experiments.
History: This paper was refereed.

Procter & Gamble (P&G) laundry products are global household brands that include Tide, Dash, and Gain, and are offered in several physical product forms, including powders, liquids, pods, tablets, and bars. These products are manufactured in more than 30 sites and sold in more than 150 countries worldwide. The design of laundry-product formulations (i.e., ingredient composition of chemical mixtures) has become more complex over the years because of challenges, such as product-portfolio expansion, rapidly changing ingredient costs and availability, and increasing competitive activity. The pace of change is fast and increasing.

Traditional formulation approaches involve simplifying the problem, hypothesizing a solution, physically creating and testing prototypes, analyzing results, and iterating the results until various objectives are met. Physical prototyping can be expensive and time consuming, resulting in slow and costly iteration cycles; as a result, these traditional approaches no longer meet today’s needs.

P&G’s research and development organization is at the forefront of the development and adoption of modeling tools that enable the company to make better decisions on product formulation, processing, and manufacturing. These include empirical, first-principles, and semi-empirical models that predict chemical reactions during manufacturing, in-use physical properties of the product, technical performance of the product, and even consumer acceptance rates. These tools enable researchers to instantly predict a product’s physical properties and performance, integrate models, and balance production trade-offs using a variety of predictive and prescriptive capabilities. Until recently, the complexity of laundry-formulation and manufacturing processes limited us to consider reformulating only a single product at a time; however, breakthroughs in mathematical optimization technology have made possible system-wide portfolio reformulation. This is critically important because it permits us to model and optimize product differentiation within a portfolio and consider sharing common materials within the manufacturing process. In this paper, we present the scope of laundry-portfolio modeling and optimization at P&G, the creation of capabilities we developed to address this scope, and its application to innovate P&G’s North American powder laundry portfolio.

Problem Definition and Challenges

The P&G North American laundry detergent business comprises three product forms: powders, liquids, and pods. Powder detergents, which generate annual sales of several hundred million U.S. dollars, are a critical part of P&G’s North American business. Even as we focus on powders as the primary application, the framework for these tools can (and must) be easily extendable to other forms. Therefore, although we focus on the powder problem in this paper, we note that the liquid form is a simplified version of this problem.

Laundry-product formulation can occur in one or more manufacturing sites to supply multiple markets. Identical product formulations are commonly made in three or four different sites to fulfill the demand of an entire region, such as Western Europe or North America. Because each manufacturing site defines its own set of products, the possibility exists that 80 percent of the products produced in two different manufacturing sites may coincide, whereas the remaining 20 percent are small-volume formulations that only one site supplies. We typically refer to a portfolio of products as a group of products with a common set of characteristics. In this paper, we define portfolio as the set of formula-unique powder laundry detergents manufactured in our North American site.

Figure 1 illustrates the mixing architecture and problem structure of the laundry detergent blending process. A large portfolio of products is created from a relatively small number of intermediate batches (i.e., 1 to 8 in Figure 1); an intermediate batch, also called an intermediate, is a mixture that is shared by various finished products. Each product is created by blending a portion of its mixture from exactly one intermediate batch with as many finishing additives as required. Intermediates and finished products are chemical mixtures of one or more ingredients or finishing additives. Ingredients or finishing additives are sourced in the form of chemical mixtures, which we refer to as premixes. The ingredient composition of each premix is given, whereas the proportion of premixes to be combined to produce a desirable mixture of ingredients must be specified (as a decision variable). Costs are specified at the premix level, whereas product properties are determined by the ingredient composition.

The goal of production is to minimize portfolio annual material spend across a network, which currently includes about 40 products and up to 40 ingredients; material costs typically account for approximately 60 percent of the total cost of production. P&G imposes many constraints to ensure that its targeted levels of quality and manufacturing feasibility are achieved. These include requirements for stain removal and whiteness performance, material balance and density of intermediate batches and product mixtures, manufacturing site (i.e., plant) throughput, water content, and raw material usage. Decisions to be made include: assignments of products to intermediates, intermediate-proportion contributions to each product, mixture compositions of intermediates, and additive proportions in final products. In addition, for laundry detergent powders, intermediate batches must conform to unique evaporation rules that make the problem more complex. Making the intermediate batch requires mixing ingredient premixes in a slurry, and then evaporating the excess water to form a free-flowing powder, which is mixed with finishing additives to create the final product.


Figure 1: (Color online) The production of laundry detergent mixtures creates a blending process.

To formalize a solution to this complex problem, we separate the analysis into two categories: predictive models and optimization. Predictive models are used to quantify the various relationships within the system, and optimization incorporates these predictive models into the mathematical formulation to determine the ideal values of decision variables. Next, we provide details about each component of the problem.

Predictive Models

Predictive models are either empirical or semi-empirical in nature. Empirical models are third-order polynomial functions that capture the two performance qualities of a mixture: stain removal and whiteness.

[Figure 2 table (left): treatments 1–16 with the levels of Variable 1, Variable 2, and Variable 3 and the measured coffee SRI. Coffee SRI for treatments 1–16: 88.56, 81.96, 85.84, 79.65, 86.20, 94.02, 83.46, 85.21, 83.48, 91.88, 84.00, 88.99, 84.58, 84.52, 88.84, 81.62. Plots (right): response surfaces of coffee SRI over Variable 1, Variable 2, and Variable 3 (axes 0 to 100), with a color scale from 80.0 to 95.0.]

Figure 2: (Color online) This experimental design for a stain-removal index (SRI) for a coffee stain is characterized by SRI coefficients (left) and can be visualized as response-surface models (right).

Empirical models for stain removal and whiteness were created using experimental design procedures, an efficient means of model creation for controlled experiments (Box et al. 2005, Kutner et al. 2004), using JMP software for the design and analysis. Figure 2 shows an example of an experimental design for a three-variable model for a coffee stain. The table in the figure lists the set of 16 test treatments we ran and the associated coffee-stain response. The image to the right of the table is a graphical representation of the treatments, with the shading corresponding to the value of the stain-removal index (SRI) for coffee. In this example, higher SRI values (darker shading) are more desirable.

Our experimental designs were based on I-optimal criteria; such designs minimize average variance of prediction over the region of experimentation (Goos and Jones 2011, Johnson et al. 2011). They also included 16 variables with all two-way and selected three-way interactions, producing third-order designs with approximately 300 model terms. These variables included all the key cleaning ingredients and wash conditions of interest (e.g., surfactants and wash temperature). We used this design procedure for all stain and whiteness models (approximately 60 responses).
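As a rough check on the size of such a model, the sketch below counts candidate polynomial terms for 16 factors. It is an illustration only: the grouping of terms and the function name `count_terms` are assumptions, and the paper does not say which three-way interactions were selected.

```python
from math import comb


def count_terms(n_factors: int) -> dict:
    """Count candidate terms of a polynomial response-surface model."""
    return {
        "intercept": 1,
        "main_effects": n_factors,
        "two_way": comb(n_factors, 2),        # all pairwise interactions
        "three_way_all": comb(n_factors, 3),  # upper limit; only some were selected
    }


counts = count_terms(16)
# Terms present before any third-order term is added.
base = counts["intercept"] + counts["main_effects"] + counts["two_way"]
print(counts)
print("terms up to two-way interactions:", base)
```

With 16 factors there are 120 two-way interactions and 560 possible three-way interactions, so a model of about 300 terms can hold only a fraction of the third-order candidates.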

We generated empirical data for the design by making laboratory-scale formula prototypes, which we physically tested in standardized wash protocols. The stain removal and whiteness procedures we used were similar to ASTM method D4265-14 (ASTM International 2014), which involves creating standard stain sets and characterizing their color before and after wash (ΔE) using image analysis. Figure 3 illustrates this procedure for stains for which measures are assigned to each of several standard technical stains that are washed together in a single experiment. Figure 4 shows an example of a standard coffee stain that has been processed with a given product test mixture (before and after wash).
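A color-difference measurement of this kind can be sketched as follows. This is a minimal illustration and not the procedure used in the study: it assumes colors are given as CIELAB triples, uses the simple CIE76 Euclidean distance for ΔE, and assumes the index is normalized against unstained fabric and scaled to 100. The function names and sample values are hypothetical.

```python
import math


def delta_e(lab1, lab2):
    """CIE76 color difference: Euclidean distance between two CIELAB colors."""
    return math.dist(lab1, lab2)


def stain_removal_index(fabric, stain_before, stain_after):
    """Percentage of the initial stain-to-fabric color difference removed by washing."""
    before = delta_e(stain_before, fabric)  # how visible the stain was before wash
    after = delta_e(stain_after, fabric)    # how visible it remains after wash
    return 100.0 * (before - after) / before


# Hypothetical (L*, a*, b*) readings from image analysis.
fabric = (95.0, 0.0, 2.0)
before = (60.0, 10.0, 25.0)
after = (90.0, 1.0, 5.0)
sri = stain_removal_index(fabric, before, after)
print(round(sri, 1))
```

A fully removed stain gives an index of 100 and an unchanged stain gives 0, which matches the convention that higher SRI values are more desirable.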

[Figure 3 diagram: stained fabric before wash and after wash, with a washer operated at temperature = t, hardness = h, and soil level = s. Stain removal = (before-wash color − after-wash color) / before-wash color.]

Figure 3: (Color online) A standard wash protocol is used when testing the stain-removal effectiveness of a mixture.

[Figure 4 images: stain before wash; stain after wash.]

Figure 4: (Color online) Stains are scanned before and after a wash experiment; in this example, we use a coffee stain.

With the experiments in the design completed, our next step was to evaluate and select response models. We accomplished model selection for each response using three stepwise regression techniques: P-value threshold, minimum-corrected Akaike information criterion, and minimum Bayesian information criterion; see Burnham and Anderson (2002, 2004), and Miller (1990) for a description of using Akaike and Bayesian information criteria for model selection. We used multivariate regression to quantify the selected models and validation metrics to determine the best model for each response, and we conducted several levels of validation for each model to characterize prediction quality. One of the common techniques involved quantifying standard fitting diagnostics for the data set used for model creation. These metrics include R square, R square adjusted, root mean square error, lack-of-fit p-values, and other similar metrics. R square adjusted for the models ranged from 0.55 to 0.95 with an average of 0.84. We used models with R square adjusted below 0.70 only if our technology experts agreed that trends that the model displayed were acceptable for business purposes. All models in this design were deemed acceptable. Figure 5 shows an example of these diagnostics for coffee SRI.
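The diagnostics named above can be computed from observed and fitted responses as in the sketch below. It is a generic illustration, not the JMP output: it uses the least-squares forms of AIC, AICc, and BIC up to an additive constant, and statistical packages differ in details such as whether the error variance counts as a parameter. The data and the function name `fit_diagnostics` are hypothetical.

```python
import math


def fit_diagnostics(y, y_hat, n_params):
    """Standard fit metrics for a least-squares model with n_params coefficients."""
    n = len(y)
    k = n_params  # fitted coefficients, including the intercept
    mean_y = sum(y) / n
    rss = sum((obs - fit) ** 2 for obs, fit in zip(y, y_hat))  # residual sum of squares
    tss = sum((obs - mean_y) ** 2 for obs in y)                # total sum of squares
    r2 = 1.0 - rss / tss
    aic = n * math.log(rss / n) + 2 * k
    return {
        "r2": r2,
        "r2_adj": 1.0 - (1.0 - r2) * (n - 1) / (n - k),
        "rmse": math.sqrt(rss / (n - k)),
        "aic": aic,
        "aicc": aic + 2 * k * (k + 1) / (n - k - 1),  # small-sample correction
        "bic": n * math.log(rss / n) + k * math.log(n),
    }


# Hypothetical observed and fitted responses for a two-coefficient model.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_hat = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
diag = fit_diagnostics(y, y_hat, n_params=2)
for name, value in diag.items():
    print(f"{name}: {value:.4f}")
```

In a stepwise search, candidate models would be compared on AICc or BIC (lower is better), while adjusted R square and RMSE describe the quality of the model finally chosen.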

[Figure 5 panels: actual by predicted plot; summary of fit; analysis of variance (source, DF, sum of squares, mean square, F ratio, Prob > F); lack of fit (source, DF, sum of squares, mean square, F ratio, Prob > F, max R square).]

Figure 5: (Color online) Fit diagnostics, shown for the SRI response function for a coffee stain, show strong predictive model performance and are typical of stain results.

Semi-empirical models are based, to some extent, on physical relationships. In this paper, we define semi-empirical models as functional forms derived from known equations (typically based on physical laws and theorems), whose coefficients were determined by fitting the equation to a set of experimental data. Finished-product density and intermediate density were the primary semi-empirical models used; both are nonconvex functions of mixture-ingredient proportions. The appendix includes detailed functional forms of these models.

Finally, we used a first-principles model of evaporative load for any given formulation to estimate the impact on evaporative rate, an important manufacturing consideration for detergent powders. This model is based on the material balance of water in the drying process.
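A water material balance of this kind can be illustrated with a few lines of arithmetic. The sketch below is an assumption about the general form of such a balance, not the model used in the study: it supposes that only water leaves the slurry during drying, so the dry solids are conserved. The function name and the numbers are hypothetical.

```python
def evaporated_fraction(water_pre: float, water_post: float) -> float:
    """Mass of water evaporated per unit mass of slurry.

    Dry solids are conserved: (1 - water_pre) * m_pre = (1 - water_post) * m_post,
    so the evaporated mass m_pre - m_post equals
    m_pre * (water_pre - water_post) / (1 - water_post).
    """
    if not 0.0 <= water_post <= water_pre < 1.0:
        raise ValueError("expected 0 <= water_post <= water_pre < 1")
    return (water_pre - water_post) / (1.0 - water_post)


# Hypothetical slurry with 30 percent water dried to 2 percent water.
load = evaporated_fraction(0.30, 0.02)
print(f"{load:.4f} kg of water evaporated per kg of slurry")
```

Multiplying this fraction by the slurry throughput gives an evaporative load that can be compared with the capacity of the drying equipment.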

Optimization

The predictive models described offer the capability to determine cleaning and density properties of a given arbitrary mixture of ingredients without resorting to expensive and time-consuming physical prototypes. The next step was to determine ideal mixture ingredient compositions to ultimately bring P&G products to the market while meeting stringent quality and manufacturing requirements. The optimization problem we describe determines the most economical composition of mixtures, evaporation levels, and intermediate-to-product assignments. In this section, we describe the optimization problem at a high level to present a basic statement of the problem, highlight complicating features, and motivate the solution-methodology discussion. We include a more detailed mathematical formulation in the appendix.

Intermediate batches and final products consist of mixtures of ingredients. Although ingredient proportions determine the properties of these mixtures, the manufacturing process does not simply blend pure ingredients. Rather, premixes (mixtures of a small subset of ingredients) are sourced and mixed to achieve a final-ingredient mixture. The eligibility of premixes to be added to an intermediate batch versus a final product is determined primarily by the nature of the premix. Premixes with high water content are typically added to intermediate batches so that the water can evaporate. Some premixes (e.g., perfumes) can be unique to one product or available to all products, either in the intermediate batch, postevaporation stage, or both.

Figure 6 illustrates the analytical representation of material flowing through various stages of the manufacturing process for a simple two-intermediate, three-product example. Individual premixes may be eligible to go into intermediate mixtures, directly into the final-product mixtures (as additives), or either. For example, the figure shows that premix Pre1 can be assigned only to intermediates (B1 and B2), whereas premix Pre3 can be assigned to intermediates and product Prod2. The additive Pre5 can be assigned only to Prod3 directly.

[Figure 6 diagram: premixes (Pre1–Pre5); common intermediates before evaporation (B1 (pre), B2 (pre)) and after evaporation (B1 (post), B2 (post)), linked by e1 and e2; products (Prod1–Prod3).]

Figure 6: The layout defines the various decision variables and constraints of the optimization problem.

Percentages by weight of premixes used in total production of intermediate batches (B1 and B2) and products (Prod1, Prod2, and Prod3) are determined as part of the optimization problem. In turn, these decision variables determine mass percentages of raw materials in the total production of intermediate and product mixtures, as determined by the given premix compositions. Because intermediates must adhere to stringent water-content requirements, an evaporation phase that produces a postevaporation mixture is necessary, as the distinction between pre- and postevaporated intermediates in Figure 6 indicates.
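The step from premix proportions to ingredient proportions is a weighted sum over the given premix compositions. The sketch below illustrates it with made-up data; the premix names follow Figure 6, but the ingredient names, compositions, and the function name are assumptions.

```python
def ingredient_fractions(premix_fractions, premix_composition):
    """Mass fraction of each ingredient in a mixture built from premixes."""
    totals = {}
    for premix, share in premix_fractions.items():
        for ingredient, fraction in premix_composition[premix].items():
            totals[ingredient] = totals.get(ingredient, 0.0) + share * fraction
    return totals


# Hypothetical data: each premix composition sums to 1.
premix_composition = {
    "Pre1": {"surfactant": 0.6, "water": 0.4},
    "Pre2": {"builder": 1.0},
}
# Decision variables in the real problem; fixed here for illustration.
premix_fractions = {"Pre1": 0.5, "Pre2": 0.5}

mix = ingredient_fractions(premix_fractions, premix_composition)
print(mix)
```

Because the premix compositions are data, this mapping is linear in the premix proportions, which is why the mass-balance requirements stay linear while the property models built on the ingredient fractions do not.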

The extent of evaporation of intermediate batches is a control variable of the manufacturing process and must be prescribed as a decision variable for each intermediate batch; because it is accomplished using spray drying technology, it is also subject to maximum evaporation constraints based on plant restrictions. The appendix provides mathematical details about the role of this variable; in this discussion, we simply point out that it introduces into the formulation a difficult bilinear term that cannot be eliminated.
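One way to see where such a bilinear term can come from is a mass balance across the evaporation step. The relations below are a sketch in hypothetical notation and are not the appendix formulation: they assume that only water is removed and that both the postevaporation composition and the evaporation level are decision variables.

```latex
% Hypothetical notation (not the appendix formulation):
%   x_i^{pre}, x_i^{post}: mass fraction of ingredient i before / after evaporation
%   e: mass of water evaporated per unit mass of pre-evaporation mixture
\begin{align}
  x_i^{\mathrm{pre}} &= (1 - e)\, x_i^{\mathrm{post}}
      && \text{for every non-water ingredient } i, \\
  x_w^{\mathrm{pre}} - e &= (1 - e)\, x_w^{\mathrm{post}}
      && \text{for water } w .
\end{align}
```

Each right-hand side contains the product of e and a postevaporation fraction. Since both factors are variables, the product is bilinear, and it cannot be removed by fixing either factor without giving up a decision the process needs to make.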

Each product can derive a portion of its mixture from at most one postevaporated intermediate. This requirement permits a continuous manufacturing process flow, which avoids capital investment for storage and handling of the intermediate batch; however, it limits the ability to channel various intermediate batches to each product. The intermediate-to-product assignment is a binary decision variable.

In addition to these requirements, the following constraints must hold:

1. Ingredient and premix mass fractions must add up to 100 percent for all mixtures in the process.

2. Water content in the pre- and postevaporation intermediate mixtures must lie within a given range (typically between one percent and three percent mass content for postevaporation).

3. Each ingredient’s mass fractions must lie within predefined bounds to ensure the integrity of the physical models.

4. Finished products must have density values within predefined bounds.

5. Finished product SRI and whiteness values must be bounded below by benchmark values.

6. A predefined number of intermediates must be used.

Items 1–3 are linear constraints and do not impose undue computational burden on the solution process; however, items 4 and 5 enforce density, SRI, and whiteness constraints, introducing additional nonlinear, nonconvex constraints to the formulation. Product density is expressed in terms of a rational function, and SRI and whiteness are third-order polynomial functions of the mixture-composition variables. In combination with the previously described bilinear term and binary decision variables, these features make this a difficult mixed-integer nonlinear programming (MINLP) problem.

The optimization objective is to minimize the total cost of premixes used in the mixing process, weighted by product dosage per stat unit (i.e., a unit of demand) and product production volume targets. This objective represents the total cost of production. The appendix provides a detailed mathematical formulation of the MINLP, including all sets, decision variables, constraints, and objectives.
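The structure of this objective can be sketched as a weighted sum. The example below is a simplified illustration with hypothetical data and names: it treats each product's overall premix fractions as given, whereas in the actual problem they result from the intermediate and additive decisions.

```python
def portfolio_cost(products, premix_cost):
    """Annual premix cost: volume x dosage x cost per unit mass, summed over products."""
    total = 0.0
    for product in products:
        # Cost of one unit mass of the product, from its premix fractions.
        unit_cost = sum(
            share * premix_cost[premix]
            for premix, share in product["premix_fractions"].items()
        )
        total += product["volume_stat_units"] * product["dosage_per_stat_unit"] * unit_cost
    return total


# Hypothetical costs per kg and production targets.
premix_cost = {"Pre1": 2.0, "Pre2": 1.0}
products = [
    {"volume_stat_units": 1000, "dosage_per_stat_unit": 0.08,
     "premix_fractions": {"Pre1": 0.5, "Pre2": 0.5}},
    {"volume_stat_units": 500, "dosage_per_stat_unit": 0.10,
     "premix_fractions": {"Pre2": 1.0}},
]
total = portfolio_cost(products, premix_cost)
print(total)
```

With volumes and dosages fixed as data, the objective is linear in the premix fractions; the difficulty of the problem lies in the constraints rather than in the cost function.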

Optimization Solution Methodology

As previously stated, a requirement of the problem is that a given number of intermediate batches must be used in the mixing process to produce a set of products, where each product can take a portion of its composition from exactly one intermediate. Any number of products can use an intermediate. We can interpret this as a set-partitioning problem with nonlinear side constraints; that is, we aim to partition the set of products into subsets, such that an intermediate is assigned to each subset of products that must satisfy various nonlinear constraints, such as performance requirements. This interpretation of the problem is closely related to the pooling problem (Bodington and Baker 1990), and can be shown to be an NP-hard mixed-integer nonlinear (nonconvex) program.
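To give a sense of the combinatorial size of the partitioning decision alone, the sketch below counts the ways to split n products into exactly k nonempty groups (Stirling numbers of the second kind). It assumes interchangeable intermediates and ignores all nonlinear constraints; the function name is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Number of ways to partition n labeled items into k nonempty unlabeled groups."""
    if n == k:
        return 1  # every item alone (also covers n = k = 0)
    if k == 0 or k > n:
        return 0
    # Item n either joins one of the k existing groups or starts a new group.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


# Sizes in line with the paper: about 40 products and a handful of intermediates.
for k in (2, 4, 8):
    print(k, stirling2(40, k))
```

Each of these partitions would still require a nonconvex continuous problem to be solved per group, which helps explain why exhaustive enumeration is not practical at this scale.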

Previous approaches to solving pooling problems include relaxation and discretization strategies (Gupte et al. 2013), Benders decomposition (Floudas and Aggarwal 1990), Lagrangian relaxation (Visweswaran and Floudas 1990), branch and cut (Audet et al. 2004), and mixed integer linear programming (MILP) (Dey and Gupte 2013). These approaches do not directly apply to the variant we consider in this paper. Most of the effort to date has focused on addressing the bilinear terms in the problem; however, the evaporation process, the requirement that assignments must be binary decisions, and nonconvex performance models violate basic assumptions of much of the work conducted to date.

State-of-the-art solution methodologies for the pooling problem that attempt to prove optimality are typically restricted to problems of (in equivalent terms) a few dozen ingredients, premixes, intermediates, and products; Dey and Gupte (2013) provide example instance sizes, where the number of inputs and outputs is less than 100. Several features of the problem we address in this paper make it a significantly harder generalization of the classical pooling problem, although its size is comparable to instances addressed by the best-known methods. Furthermore, the best-known methods often take many hours to reach a small optimality-bound gap. For our purposes, solutions must be produced on the order of minutes (to allow scenario exploration, which is critical to P&G formulation practices); therefore, we have taken a heuristic approach to solve this optimization problem. The appendix provides a detailed explanation of the optimization solution methodology.

Implementation and Usage

Team and Workflow

To best utilize this new analytical capability, P&G had to devise a new way of organizing our team and also implement a new work process. The use of optimization tools requires a set of skills best defined by an optimization triad (Figure 7) that includes functional, data, and optimization experts.

Functional experts are typically a small number of individuals from different R&D functions, including consumer, formulation, and process. Data experts are one or two individuals who have access to all the necessary information, such as material pricing and material balances, and are typically skilled in visual analytics tools, such as JMP statistical tools. Optimization experts are staff members who have the necessary programming skills to interact with the optimization models at the SAS code level. Figure 7 defines the responsibilities of the experts in each category.

[Figure 7 diagram: a triad linking the optimization expert, the functional expert, and the data expert.]

Figure 7: (Color online) This figure shows the optimization triad that P&G adopted as part of its standard process.

Figure 8: (Color online) The optimization work process at P&G follows a sequence of nine interrelated steps.

We defined a new work process (Figure 8), which has enabled this multifunctional, multiskilled team to efficiently use the new capability; we describe the components as follows.

1. Problem definition: Functional experts frame the problem to be solved and the research questions to be addressed. For example, we researched and analyzed processes that would (1) allow P&G to deliver significantly better product performance at equal or lower cost over its current processes, and (2) reduce the number of intermediates while maintaining performance.

2. Knowledge development: Functional experts start building the problem’s framework by collecting knowledge and building a common state of understanding across the team. We define knowledge as existing models, heuristic approaches, and the simplification of assumptions, all of which must form a basis for agreement. For example, if different density models exist, the team must come to a consensus on which density model to use. Additional data or analysis may be required if the team cannot come to an agreement because of missing or conflicting information.

3. Model development: Functional experts develop models for new-product parameters or adapt (e.g., linearize) current models to make them more suitable for optimization.

4. Portfolio data gathering: Data experts collect and format all the information needed for the framework. This proved to be a challenging step because we encountered multiple databases with missing relational keys. The process can be quite manual in some cases.

5. Optimization problem formulation: Optimization experts interpret the research question(s) and, using the knowledge collected, models, and portfolio information as context, create or adapt the mathematical framework for optimization by defining or modifying variables, objective functions, and constraints. Optimization experts start the modeling work as soon as the research question has been defined, and further refine the optimization models as additional knowledge is generated and the portfolio data are compiled.

6. Write optimization code: Optimization experts write or adapt SAS/OR code as needed to address the new research question, constantly verifying assumptions and validating preliminary results with other experts and stakeholders.

7. Iterative testing and mathematical validation: Optimization experts participate with other members of the team in a cycle of iterative testing to mathematically validate models, ensuring that constraints are correctly interpreted and observed and that optimum solutions are robust. In this step, parameters (e.g., the number of multistart points and stopping criteria) are tuned, and model infeasibilities, often caused by products with too-stringent performance constraints, are commonly found. At this stage, steps must be taken to ensure feasibility; revisiting the constraint bounds is an example.

8. Churn and analysis: In this step, the entire team exercises the optimization engine to optimize multiple scenarios to answer the research question(s). Scenarios can include changes in performance requirements, the number of intermediates allowed, constraint bounds, materials allowed to be used (and where they can be used), or a combination of these scenarios. For highly complex research questions, P&G implements a churn event; in such an event, all members of the team are colocated for two to four days. They focus all their time on a common set of problems, and their objective is to produce data to inform decision making. During churn, the team occupies a common room equipped with physical and digital visualization tools that aid the work process. Poster-sized printouts display a master list of all scenarios, and key parameters are captured and color coded to differentiate the scenarios. They analyze completed scenarios on a display that consists of eight 42" high-definition television screens configured to behave as a single monitor, allowing high-resolution visualizations to be spread across a large area. The team typically reviews the results using JMP software, which permits interactive visualization and analysis.

9. Optimized recommendation: Functional experts take results from the analysis and formulate a recommendation. Recommendations can be as simple as a new set of formulations to meet a new requirement, or as complex as a multistage strategy for a portfolio of products, which might include the introduction of new technologies.

In the course of this project, we learned that a mind-set shift is required to solve for the entire portfolio rather than for one product at a time. Traditionally, projects have addressed formulation changes for one or only a small subset of products, while preserving the composition of most products in the portfolio. Once this mind-set shift happened, however, we were able to identify (and test) portfolio-management strategies, which led to some unique options that P&G had not previously uncovered.

Outputs and Results

Because the algorithm we present in this paper is based on building an approximation of the power set of products, each optimization can be easily extended to solve for various numbers of intermediate batches by solving the selection step for as many independent subproblems as there are products, from one intermediate for all products to the singleton cases. We exploited this aspect of the algorithm to provide a range of intermediate batch configurations for each optimization run for marginal additional run time, running the set-covering subproblems in parallel. When running in this mode, one must determine where to apply the stopping criterion. Because P&G's production focus is on numbers of intermediates greater than or equal to four, we apply our stopping criterion at three intermediates at an estimated gap of three percent for all optimization runs.

Figure 9 shows solution objective values (total cost difference for annual production) of a typical optimization, which we represent as the difference between our solution output objective and the annual cost of a historical production run. An obvious feature of the problem we consider here is that as the number of intermediate batches increases, the globally optimal objective must not increase, because the model allows more flexibility in tuning each product's assigned intermediate. The figure reflects this, and we note that the algorithm's flow was designed to ensure that this rationality check would never be violated despite the presence of nonconvexity.
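The rationality check described above can be sketched in a few lines of Python. This is only an illustration: the function name and the cost values are hypothetical and are not taken from the P&G tool or its data.

```python
def is_nonincreasing(cost_by_intermediates):
    """Return True if cost never rises as the number of intermediates grows."""
    ordered = [cost_by_intermediates[n] for n in sorted(cost_by_intermediates)]
    return all(later <= earlier for earlier, later in zip(ordered, ordered[1:]))


# Hypothetical cost differences versus a benchmark (in $ millions).
consistent = {3: -1.0, 4: -1.5, 5: -1.5, 6: -2.0}
violating = {3: -1.0, 4: -0.5}

check_consistent = is_nonincreasing(consistent)
check_violating = is_nonincreasing(violating)
```

A run that fails this check would indicate that some configuration with more intermediates has not inherited a cheaper solution already known for fewer intermediates.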

Reporting such a multiple intermediate solution proved valuable to management because an immediate comparison of neighboring solutions could show the benefits of increasing (or decreasing) the number of intermediate batches in production, which is a costly, but sometimes beneficial, manufacturing investment. For example, the marginal benefits of increasing the number of intermediate batches from 14 to 17 are minimal; only at 18 intermediates does a significant change occur (because of the discrete nature of the problem), which may justify the added complexity and cost.

Figure 9: (Color online) Multiple intermediate-solution annual costs show a potential reduction of $4 million when the number of intermediate batches is held constant for a 14-batch instance. An alternative interpretation suggests that, at similar costs and levels of quality, production can occur with only six intermediate batches.

As mentioned previously, each optimization instance was referenced against a related historical benchmark production run (created without the benefits of an analytical approach). Our expectation for the success of this project was that the solution provided by the framework would always improve on the benchmark's premix annual costs. This is a reasonable expectation because, for a given number of intermediate batches, the benchmark is typically a feasible mixture. Comparing the 14-intermediate solution annual cost with the benchmark value in Figure 9 shows that this requirement was met in this example, and that the solution typically produces savings on the order of magnitude shown (between $0.5 and $6 million).

Few instances currently exist at P&G because statistical and semi-empirical models have only recently been developed for entire product portfolios. Table 1 lists two important powder instances we used in this work and also shows a number of ingredients, premixes, products, response-surface models, and benchmark intermediates that represent typical problem sizes.

The optimization tool produces significantly better results than the benchmark for both instances in the table; its run times are very reasonable, given the needs of the P&G work process. Tables 2 and 3 summarize the results for these instances, listing most intermediate configurations from three to the number of products in the instance. The tables show cost differences from the benchmark (in millions) and run times at which each intermediate configuration satisfied its stopping criterion. The longest run time is the total run time of the process. Note the cost improvements when compared to the benchmark even when running at the minimum number of intermediate batches (three) for both instances. The values in the row whose number of intermediates matches the benchmark (12 for 3bii and 13 for Wenlock) represent a direct comparison with the benchmark intermediate configuration.

Instance name   Ingredients   Premixes   Products   RS   Intermediates
3bii            54            38         21         50   12
Wenlock         42            34         25         38   13

Table 1: This table summarizes the size of two representative instances in P&G's portfolio. The term RS refers to the number of response-surface models, and the Intermediates column represents the number of intermediate batches in the benchmark solution.

Intermediates   Cost diff. vs. bench ($ in millions)   Bounds (%)   Run time (sec)
3               −15.8                                  2.35         458.5
4               −16.9                                  1.77         253.0
5               −18.0                                  1.19         110.2
6               −18.8                                  0.78         110.4
7               −19.2                                  0.58         110.5
8               −19.4                                  0.45         110.6
9               −19.6                                  0.36         110.6
10              −19.7                                  0.29         110.7
11              −19.8                                  0.22         110.9
12              −20.0                                  0.15         111.0
13              −20.1                                  0.09         111.1
14              −20.2                                  0.04         111.2
21              −20.3                                  0.00         32.6

Table 2: The table shows results for instance "3bii." The values in the row for 12 intermediates represent a best comparison to the benchmark solution, which has 12 intermediate batches.

Intermediates   Cost diff. vs. bench ($ in millions)   Bounds (%)   Run time (sec)
3               −76.4                                  2.46         256.6
4               −79.8                                  1.42         256.7
5               −81.9                                  0.79         256.8
6               −82.6                                  0.58         257.0
7               −83.0                                  0.43         257.2
8               −83.5                                  0.28         257.3
9               −83.7                                  0.21         257.4
10              −83.9                                  0.15         257.5
11              −84.0                                  0.12         257.7
12              −84.1                                  0.09         257.8
13              −84.2                                  0.07         258.0
14              −84.3                                  0.05         258.1
15              −84.3                                  0.02         258.3
16              −84.4                                  0.01         258.4
25              −84.4                                  0.00         68.5

Table 3: In these results for the Wenlock instance, the values in the row for 13 intermediates represent a best comparison to the benchmark solution, which has 13 intermediate batches.

The most direct application of our work has been in P&G's North American dry laundry portfolio, which consists of more than 20 unique formulations. Our objective was to study different strategies for simplification and savings, while delivering the same or better consumer-relevant cleaning performance. The churn team spent two days running and analyzing more than 30 scenarios. Here, we discuss two examples from the study results.

• Increasing the number of intermediates allowed from five to 10 would result in additional formula cost savings. Compared to a five-intermediate benchmark, using the optimization procedure would result in annual cost improvements of $2 million for five intermediates and $5 million for 10 intermediates. Although a consequence of this strategy is to increase complexity at the manufacturing site for handling a higher number of intermediate batches, this is a justifiable decision based on this demonstrable cost reduction.

• The introduction of a new active ingredient (currently not in the North American powder formulation, but available in empirical and semi-empirical models) across the whole portfolio would generate annual savings in excess of $20 million, while delivering the target performance profile (i.e., cleaning of different stains) for each product. P&G has incorporated this knowledge into its short- to mid-term strategies for this part of our business.

We have estimated that without the portfolio optimization tool, twice as many staff members would have to work for far longer periods of time to run 30 scenarios on 20 formulations using the former single-product-at-a-time approach and would produce inferior results.

User sponsorship and adoption have been instrumental to the success of this project. In an email to the team, Christian Becerra, P&G senior researcher and lead formulator for the North American powder business, provided a set of additional benefits of our portfolio optimization framework that go beyond the savings described earlier (Becerra 2014):

• Smart optimization: Identifying a formula and process strategy that meets our criteria, removing chemistry that will not deliver the desired performance profile to the consumer. This leads to smart savings.

• Next level of optimization: Going beyond the traditional single-formula to full-portfolio optimization (i.e., the ability to see the big picture).

• Flexibility: Integrating technical features and an understanding of consumer needs to make more robust portfolio propositions.

• Redefining the cleaning vision: Differentiating performance between brands and maintaining brand equity.

• Exploring out-of-the-box concepts: Exploring current production constraints and raw material ingredients, but with the flexibility to integrate deviations that may lead to better results in performance and process.

• Managing what-if scenarios: Saving time and resources.

• Multifunctional integration: Performing a more robust proposition versus isolated optimization efforts based on function.

Summary of Benefits

Portfolio optimization is changing the way we do product development at P&G. Previously, we were often limited to one of two strategies; we developed each product individually, resulting in highly complex portfolios that required high numbers of intermediate batches, or we imposed simplification strategies that resulted in higher formulation costs.

Portfolio optimization allows us to test formulation and simplification strategies against the whole portfolio, giving us a realistic estimate of the potential impact of these strategies. Fast iteration cycles allow us to evaluate multiple strategies in a short time, discarding elements that will bring little value and combining elements that provide significant advantages. The ability to analyze an entire portfolio simultaneously is also changing the way product designers think about performance, because we can now more easily differentiate performance among the products. Thus, we can make smarter decisions about formulation and simplification strategies and respond with agility when needed.

As lead strategies are identified and the formulation for the full portfolio is generated, some physical testing is required to confirm that the required performance and physical properties of the products are indeed met. This is especially true for solutions that are near the minimum or maximum values of the input ranges, because the confidence intervals of the predictions are typically at their widest in these ranges. Future phases of this project will address this need, and continuously expand and improve the quality of predictive and optimization models.

As a direct result of using this optimization platform and measuring its value to our business, we are now making the following enhancements to our processes:

• Evolving P&G and its work processes to better use this and other operations research capabilities to define and execute the best portfolio strategies.

• Investing in resources to improve our data management system and automate data pull and formatting.

• Building more models into the optimization framework and continuing to do so as new models become available, potentially including consumer models and first-principles chemistry models.

Finally, although the benefits of this tool have been well demonstrated in our laundry-powder business, many reapplication opportunities remain at P&G. In the coming years, we expect to continue to develop this capability further for our laundry business and create similar capabilities for other P&G businesses.

Appendix

Semi-empirical Models

Stain-removal performance is predicted by the stain-removal index (SRI) response function, which has the following form:

\[ \mathrm{SRI} = C_0 + C_1 v_1 + C_2 v_2 + C_3 v_1 v_2 + \cdots, \]

where $C_i$ represent coefficients and $v_i$ represent design variables: wash concentrations (milligrams per liter) of chemical ingredients and wash conditions (e.g., temperature). Whiteness models have a similar form.
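As a minimal sketch of how such a response function is evaluated, the Python below truncates the SRI polynomial after the first interaction term. The coefficient values and variable values are hypothetical, chosen only to make the arithmetic easy to follow; they are not fitted P&G coefficients.

```python
def sri(coeffs, v1, v2):
    """Evaluate SRI = C0 + C1*v1 + C2*v2 + C3*v1*v2 (higher-order terms omitted)."""
    c0, c1, c2, c3 = coeffs
    return c0 + c1 * v1 + c2 * v2 + c3 * v1 * v2


# Hypothetical coefficients and design variables
# (for example, an ingredient wash concentration and a temperature).
example_sri = sri((1.0, 2.0, 3.0, 0.5), v1=2.0, v2=4.0)
```

The bilinear term `c3 * v1 * v2` is the kind of interaction that makes the performance constraints nonconvex in the optimization model.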

Intermediate Batch Density

\[ \mathit{CD}_b(\phi'_b) = C_0\, \mathit{CI}_b(\phi'_b) + C_1 V_3 + C_2 V_4 + \cdots \]

$\mathit{CI}_b$: "True density" of an intermediate batch, also known as absolute density; the density of only the solid components of the batch post-evaporation.

$\mathit{CD}_b$: "Bulk density" or the density accounting for the mass of solid components and air entrapped in the particle.

$\phi'_b$: Vector $\phi'_{ib}$ of postevaporation mass fraction of ingredient i in intermediate b.

$C_0, C_1, C_2, \ldots$: Regression coefficients.

$V_j$: Variables that define processing and chemistry parameters (e.g., temperature, hardness, $\phi'_b$).

\[ \mathit{CI}_b(\phi'_b) = \frac{1}{\sum_{i=1}^{n} \phi'_{ib} / \rho_i}. \]

$\rho_i$: Liquid density of pure component i.

Product Density

\[ \mathit{FD}_k(z_k, x_k) = \frac{1}{\sum_{i=1}^{n} \bigl( z_{ik} / (\rho_i f_i) \bigr) + x_k / \mathit{CD}_b(\phi'_b)}. \]

$z_k$: Vector of mass fractions $z_{ik}$ of finishing additive i in finished product k.

$x_k$: Mass fraction of assigned intermediate in production of finished product k.

$\rho_i$: Density of finishing additive i.

$f_i$: Packing factor of finishing additive i.
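The density relationships above can be sketched as three small Python functions. This is an illustration under stated assumptions: the function names are hypothetical, the bulk-density regression is truncated to a constant multiple of true density plus linear process terms, and all numeric inputs are made up.

```python
def true_density(post_fractions, liquid_densities):
    """CI_b: reciprocal of the sum of mass fraction over pure-component density."""
    return 1.0 / sum(f / rho for f, rho in zip(post_fractions, liquid_densities))


def bulk_density(ci_b, coeffs, process_vars):
    """CD_b = C0*CI_b + C1*V + C2*V' + ... (linear terms only in this sketch)."""
    c0, *rest = coeffs
    return c0 * ci_b + sum(c * v for c, v in zip(rest, process_vars))


def product_density(additive_fractions, additive_densities, packing_factors, x_k, cd_b):
    """FD_k: reciprocal of additive volume terms plus intermediate volume term."""
    additive_term = sum(
        z / (rho * f)
        for z, rho, f in zip(additive_fractions, additive_densities, packing_factors)
    )
    return 1.0 / (additive_term + x_k / cd_b)


# Hypothetical inputs.
ci = true_density([0.5, 0.5], [1.0, 2.0])        # 1 / (0.5 + 0.25)
cd = bulk_density(ci, (0.5, 0.1), (2.0,))        # 0.5 * ci + 0.1 * 2.0
fd = product_density([0.2], [2.0], [0.5], x_k=0.8, cd_b=0.8)
```

Because the intermediate's bulk density appears in the denominator of the product density, product density depends on which intermediate a product is assigned to, which is why the density constraint in the model involves the assignment variables.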

Optimization Model

Notation

Sets
• $\mathcal{P}$: Set of premixes.
• $\mathcal{I}$: Set of ingredients (raw materials).
• $\mathcal{B}$: Set of intermediate batches.
• $\mathcal{K}$: Set of finished products.

Parameters
• $n$: Number of intermediates required in the solution.
• $\bar{v}_{pb}$: Mass percentage upper bound of premix p in production of pre-dried intermediate b.
• $\bar{v}'_{pb}$: Mass percentage upper bound of premix p in production of postevaporated intermediate b.
• $\bar{z}_{pk}$: Mass percentage upper bound of premix p in production of finished product k.
• $\phi_{ip}$: Mass percentage of ingredient i in production of premix p.
• $c_p$: Cost of premix p, in dollars per metric ton.
• $q^s_k$ (stat factor): Number of doses in a stat unit for product k.
• $q^v_k$: Production target of product k: number of stat units sold per year.
• $q^w_k$ (wash volume): Volume of water in a single wash for product k, in liters.
• $q^d_k$ ("dosage"): Mass of product k in the wash machine for one wash cycle, in grams.
• $\bar{R}_k$: Maximum rate of evaporation limit for product k.
• $\bar{r}_b$: Maximum evaporation capability for intermediate batch b.

Decision Variables
• $v_{pb} \in [0, \bar{v}_{pb}]$: Mass percentage of premix p in production of pre-evaporated intermediate b.
• $e_b$: Evaporation variable for intermediate b.
• $z_{pk} \in [0, \bar{z}_{pk}]$: Mass percentage of premix p in production of finished product k as an additive (not subject to evaporation).
• $x_k \in [0, 1]$: Mass percentage of intermediate component in production of product k.
• $y_{bk} = 1$ if finished product k is assigned to intermediate batch b, 0 otherwise.

Because mixing decisions are made at a premix level, variables v, z, and y take a $p \in \mathcal{P}$ index, thus specifying premix composition. The evaporation process is expressed in terms of ingredient compositions, requiring us to calculate ingredient mass percentage quantities pre- and postevaporation for each ingredient i in each intermediate b:

\[ \phi_{ib} = \sum_{p \in \mathcal{P}} \phi_{ip} v_{pb}, \quad \forall\, i \in \mathcal{I},\ b \in \mathcal{B}, \tag{1} \]

\[ \phi'_{ib} = e_b \phi_{ib}, \quad \forall\, i \in \mathcal{I} \setminus \{w\},\ b \in \mathcal{B}, \tag{2} \]

\[ e_b (1 - \phi_{wb}) = 1 - \phi'_{wb}, \quad \forall\, b \in \mathcal{B}, \tag{3} \]

where the subscript w is used to denote water. Similarly, empirical and semi-empirical functions are expressed in terms of ingredient mass percentage at the product level, requiring us to calculate these values for each ingredient i in each product k:

\[ \phi_{ik} = \sum_{p \in \mathcal{P}} \phi_{ip} z_{pk} + x_k \sum_{b \in \mathcal{B}} y_{bk} \phi'_{ib}, \quad \forall\, i \in \mathcal{I},\ k \in \mathcal{K}. \tag{4} \]
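To make the evaporation bookkeeping of Equations (1) to (3) concrete, here is a small Python sketch for a single intermediate batch. The premix names, ingredient names, and numbers are hypothetical, and the function names are introduced only for this illustration.

```python
def intermediate_composition(phi_ip, v_pb):
    """Eq. (1): ingredient fractions of one pre-evaporation batch."""
    return {
        i: sum(fraction_in_premix[p] * v_pb[p] for p in v_pb)
        for i, fraction_in_premix in phi_ip.items()
    }


def evaporate(phi_ib, water_after, water="water"):
    """Eqs. (2)-(3): evaporation factor e_b and postevaporation fractions."""
    e_b = (1.0 - water_after) / (1.0 - phi_ib[water])            # Eq. (3)
    post = {i: e_b * f for i, f in phi_ib.items() if i != water}  # Eq. (2)
    post[water] = water_after
    return e_b, post


# Hypothetical premix compositions (ingredient -> premix -> mass fraction).
phi_ip = {
    "water":      {"A": 0.5, "B": 0.0},
    "surfactant": {"A": 0.5, "B": 0.0},
    "builder":    {"A": 0.0, "B": 1.0},
}
pre = intermediate_composition(phi_ip, {"A": 0.6, "B": 0.4})
e_b, post = evaporate(pre, water_after=0.125)
```

In this example the batch goes from 30 percent to 12.5 percent water, so the non-water fractions are scaled up by e_b = 1.25 and the postevaporation fractions again sum to one.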

Constraints. Fundamental physical requirements that characterize the mixing process are modeled as constraints. Intermediate and final-product mixtures have mass percentage ingredient contributions that must add to 100 percent:

\[ \sum_{p \in \mathcal{P}} v_{pb} = 1, \quad \forall\, b \in \mathcal{B} \qquad \text{and} \qquad \sum_{p \in \mathcal{P}} z_{pk} + x_k = 1, \quad \forall\, k \in \mathcal{K}. \tag{5} \]

The rate of evaporation $R_k$ is limited by

\[ R_k = q^v_k x_k \sum_{b \in \mathcal{B}} (\phi_{w,b} - \phi'_{w,b})\, y_{bk} \le \bar{R}_k, \quad \forall\, k \in \mathcal{K}. \tag{6} \]

Additionally, there are physical limitations on the amount of water that can be evaporated in the intermediate batch:

\[ \phi_{w,b} - \phi'_{w,b} \le \bar{r}_b, \quad \forall\, b \in \mathcal{B}. \tag{7} \]

Empirical and semi-empirical models are based on experimental designs that are valid only within specific values, and accuracy can degrade severely if extrapolated beyond these bounds. Furthermore, compositions cannot differ drastically from benchmark mixtures, and the water content of powder products must be strictly controlled. Therefore, ingredient mass percentage values must lie within predefined lower and upper bounds:

\[ \underline{\phi}_{ik} \le \phi_{ik} \le \bar{\phi}_{ik}, \quad \forall\, i \in \mathcal{I},\ k \in \mathcal{K}. \tag{8} \]

Similarly, bounds must be imposed on ingredient compositions in the intermediate batches:

\[ \underline{\phi}_{ib} \le \phi_{ib} \le \bar{\phi}_{ib}, \qquad \underline{\phi}'_{ib} \le \phi'_{ib} \le \bar{\phi}'_{ib}, \quad \forall\, i \in \mathcal{I},\ b \in \mathcal{B}, \tag{9} \]

which include, most importantly for the role they play in the manufacturing of intermediate batches, constraints on water content.

Empirical models impose constraints on product performance. The term $F_k(\phi_k)$ represents a vector for each product of all SRI and whiteness functions, each row characterized by third-order polynomial expressions of the product's composition $\phi_{ik}$ and the parameters temperature, hardness, and soil level. Products created by the optimization must achieve a minimum level of performance, which is defined by vector $f_k$ for each product:

\[ F_k(\phi_k) \ge f_k, \quad \forall\, k \in \mathcal{K}. \tag{10} \]

Semi-empirical models impose constraints on product density,

\[ \underline{\mathit{FD}}_k \le \mathit{FD}_k(z_k, x_k, y_{bk}) \le \overline{\mathit{FD}}_k, \quad \forall\, k \in \mathcal{K}, \tag{11} \]

where we recall that $\mathit{FD}_k(z_k, x_k, y_{bk})$ is a nonlinear function representing the density of product k, which is dependent on $y_{bk}$ through its corresponding intermediate batch density $\mathit{CD}_b$.

Finally, we recall that a requirement of the process is that each product is assigned to exactly one intermediate batch:

\[ \sum_{b \in \mathcal{B}} y_{bk} = 1, \quad \forall\, k \in \mathcal{K}. \tag{12} \]

Objective. The optimization objective is to minimize the total cost of premixes used in the mixing process, weighted by product dosage per stat unit ($q^s_k$, a unit of demand) and product production-volume targets ($q^v_k$), using at most n intermediates:

\[ \min \sum_{k \in \mathcal{K}} q^s_k q^v_k \sum_{p \in \mathcal{P}} c_p \Bigl( z_{pk} + x_k \sum_{b \in \mathcal{B}} y_{bk} e_b v_{pb} \Bigr). \tag{13} \]

A useful quantity in the presentation to follow is the cost of production of a subset of products $A \subseteq \mathcal{K}$ for a given set of values x, y, e, v, and z:

\[ c_A = \sum_{k \in A} q^s_k q^v_k \sum_{p \in \mathcal{P}} c_p \Bigl( z_{pk} + x_k \sum_{b \in \mathcal{B}} y_{bk} e_b v_{pb} \Bigr). \]

Optimization Solution Methodology

Our algorithm is based on a column-generation heuristic, where a set-covering master problem interacts with independent subproblems that prescribe intermediate-to-product groupings and mixture compositions. The algorithm is based on the following sequence of steps:

1. Singleton
2. Grouping
3. Configuration
4. Selection
5. Return to step 2 until convergence.

Singleton Step

To start the process, we solve for the artificial case in which each product is allowed to have its own dedicated intermediate batch. This is equivalent to specifying $|\mathcal{K}|$ independent problems, each with n = 1. For an example portfolio {P1, P2, P3}, this requires us to solve three independent mixture-optimization problems comprising {P1}, {P2}, and {P3}. Figure A.1 shows the corresponding {Prod1} singleton of the problem illustrated in Figure 6.

[Figure A.1 diagram labels: Premixes (Pre1, Pre2, Pre3, Pre4); Common intermediates (B1 (pre), e1, B1 (post)); Products (Prod1).]

Figure A.1: The singleton subproblem layout for {Prod1} is much simpler than the full-problem layout of Figure 6 and is independent of other products.

We use the SAS/OR interior point nonlinear programming solver for the singleton subproblems, which are independent and can therefore be solved in parallel by enabling the SAS cofor multithreaded processing capability. Complications exist, however, primarily because of bilinear terms and nonconvex empirical and semi-empirical functions. We address these complications by employing the multistart mechanism of the SAS nonlinear programming (NLP) solver (also threaded), which aims to improve the likelihood of finding globally optimal solutions. Note that the singleton step subproblem is a specialization of the configuration step problem, where $\mathcal{B} = \mathcal{K}$ and $y_{kk} = 1$.

Although global optimality is neither provable nor guaranteed, we will describe a method for improving the ultimate quantity that is to be derived from these solutions: the globally optimal cost $c_k$ of each singleton. For now, we begin to build a groupings pool $\mathcal{A}$ as the union of all singletons. In the previous example, $\mathcal{A} = \{\{P1\}, \{P2\}, \{P3\}\}$.

Grouping Step

We exploit a physical observation to generate a relatively small number of promising product groups that approximate the power set of products. The observation relies on the premise that it is reasonable to expect products with similar performance requirements to benefit from extracting a portion of their mixture from the same intermediate. Singletons are ideal chemical compositions for products because they benefit from dedicated intermediates. Grouping finished products based on the similarity of their singleton intermediate batch compositions is known from practice to be beneficial. For example, products that must meet stringent SRI requirements for the same stains often share an intermediate to reduce costs; in isolation, they would have very similar intermediate batches.

To exploit this observation, we define a metric of similarity between optimal (or best-known) intermediate compositions of singleton solutions. For any two singletons k and l,

\[ \delta_{kl} = \sum_{p \in \mathcal{P}} \bigl| c_p \bigl( v'_{pk} - v'_{pl} \bigr) \bigr| \tag{14} \]

represents a sum of the absolute-value difference of mass percentage of premixes in their respective singleton postevaporation intermediates, weighted by the cost of premix, where we have used the product indices to represent corresponding singleton pairs.
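The similarity measure of Equation (14) is a cost-weighted L1 distance between two singleton intermediate compositions. A minimal Python sketch follows; the premix names, costs, and compositions are hypothetical.

```python
def similarity(premix_cost, comp_k, comp_l):
    """Cost-weighted sum of absolute differences in premix mass percentage."""
    return sum(
        abs(premix_cost[p] * (comp_k.get(p, 0.0) - comp_l.get(p, 0.0)))
        for p in premix_cost
    )


# Hypothetical premix costs ($ per metric ton) and postevaporation
# premix fractions of two singleton intermediates.
premix_cost = {"A": 100.0, "B": 200.0}
singleton_k = {"A": 0.6, "B": 0.4}
singleton_l = {"A": 0.5, "B": 0.5}

delta_kl = similarity(premix_cost, singleton_k, singleton_l)
```

A small value indicates that the two products would choose nearly the same intermediate when left alone, so forcing them to share one should cost little; weighting by premix cost makes differences in expensive premixes count more.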

The goal of the grouping step is to generate candidate groups of products that minimize the sum of $\delta_{kl}$, such that n intermediate batches are used. We accomplish this by defining a model based on bipartite assignments, where each product is eligible to be paired with each singleton's optimal intermediate. Binary variables determine which intermediates should be used, and arc variables determine prospective intermediate-to-product pairings. These are then interpreted to specify product groups that are appended to $\mathcal{A}$ (each composed of a subset of products).

Sets
• $\mathcal{A}$: Current groupings pool.
• $\bar{\mathcal{K}} = \{(k, l) \in \mathcal{K} \times \mathcal{K} : k \ge l\}$.

Parameters
• $\delta_{kl}$: Similarity measure; see Equation (14).

Variables
• $v_l = 1$ if the intermediate of product l is used in calculating $\delta_{kl}$ for all k to be grouped with l, 0 otherwise.
• $u_{kl} = 1$ if product k is assigned to intermediate of singleton l, 0 otherwise.
• $\gamma_A = 1$ if group $A \in \mathcal{A}$ is selected into the solution, 0 otherwise.

Grouping Problem Formulation

\[ \text{Minimize} \quad \sum_{(k,l) \in \bar{\mathcal{K}}} \delta_{kl} u_{kl} \tag{15} \]

\[ \text{subject to} \quad \sum_{l \in \mathcal{K}} u_{kl} = 1 \quad \forall\, k \in \mathcal{K}, \tag{16} \]

\[ \sum_{l \in \mathcal{K}} v_l = n, \tag{17} \]

\[ u_{kl} \le v_l \quad \forall\, (k,l) \in \bar{\mathcal{K}}, \tag{18} \]

\[ v_k \le 1 - u_{kl} \quad \forall\, (k,l) \in \bar{\mathcal{K}},\ k \ne l, \tag{19} \]

\[ u_{kl} \in \{0, 1\} \quad \forall\, (k,l) \in \bar{\mathcal{K}}, \tag{20} \]

\[ v_l \in \{0, 1\} \quad \forall\, l \in \mathcal{K}. \tag{21} \]

Inequality (19) ensures that if a product has been assigned to an intermediate batch of another product's singleton, its own singleton intermediate batch becomes ineligible for grouping other products.
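For very small portfolios, the assignment structure of the grouping problem can be explored by brute force. The Python sketch below is a simplification, not the MILP solved in the paper: it enumerates every choice of n singleton intermediates, assigns each remaining product to the most similar chosen intermediate, and ignores both the index-ordering restriction on pairs and the constraints that force new groups into the pool. The similarity values are hypothetical.

```python
from itertools import combinations


def group_products(delta, n):
    """Pick n singleton intermediates and assign every product to the closest one."""
    products = sorted(delta)
    best_cost, best_groups = None, None
    for centers in combinations(products, n):
        groups = {c: [c] for c in centers}  # a chosen singleton keeps its own product
        cost = 0.0
        for k in products:
            if k in groups:
                continue
            closest = min(centers, key=lambda l: delta[k][l])
            groups[closest].append(k)
            cost += delta[k][closest]
        if best_cost is None or cost < best_cost:
            best_cost, best_groups = cost, groups
    return best_cost, [frozenset(g) for g in best_groups.values()]


# Hypothetical symmetric similarity values between three singletons.
delta = {
    "P1": {"P1": 0, "P2": 1, "P3": 10},
    "P2": {"P1": 1, "P2": 0, "P3": 9},
    "P3": {"P1": 10, "P2": 9, "P3": 0},
}
cost, groups = group_products(delta, 2)
```

Here P1 and P2 have nearly identical singleton intermediates, so the two-intermediate grouping pairs them and leaves P3 on its own. Enumeration grows combinatorially with the number of products, which is why a MILP formulation is used for real instances.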

Groups implied by $u_{kl}$ are appended to $\mathcal{A}$ for further steps in the algorithm (i.e., configuration and selection). Our definition of $\bar{\mathcal{K}}$ ensures no symmetry in the problem, allowing us to use $u_{kl}$ to uniquely map $\bar{\mathcal{K}}$ to $\mathcal{A}$. For example,

\[ u = \begin{pmatrix} 1 & & \\ 1 & 0 & \\ 0 & 0 & 1 \end{pmatrix} \]

has a one-to-one mapping to $\{\{P1, P2\}, \{P3\}\}$. Multiple iterations of the algorithm build out $\mathcal{A}$ by appending nonzero columns of each solution u to $\mathcal{A}$. For example,

\[ \mathcal{A} = \{\{P1\}, \{P2\}, \{P3\}, \{P1, P2\}\}, \]

created by appending the above solution to a groupings pool that has been initialized with singletons, is represented in matrix form as

\[ \mathcal{A} = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}. \]

Because the purpose of this step is to enrich $\mathcal{A}$ to provide a better approximation of the optimal members of $\mathbb{P}(\mathcal{K})$, each iteration (i.e., each time the grouping problem is called) must produce n groups of which at least one is not currently in $\mathcal{A}$. We accomplish this by adding to the above formulation variable $\gamma_A$ and constraints

\[ \sum_{l \in A} (1 - u_{lk}) + \sum_{l \in \mathcal{K} \setminus A} u_{lk} \ge 1 - \gamma_A \quad \forall\, A \in \mathcal{A},\ k \in \mathcal{K}, \tag{22} \]

and

\[ \sum_{A \in \mathcal{A}} \gamma_A \le n - 1, \tag{23} \]

where we interpret A as columns of $\mathcal{A}$, $l \in A$ as rows of $\mathcal{A}$ with values of 1, and $l \in \mathcal{K} \setminus A$ as rows of $\mathcal{A}$ with values of 0.

We can interpret the variable $\gamma_A$ as follows. For an existing group $A \in \mathcal{A}$, if $\gamma_A = 0$, then Inequality (22) ensures that A is not a group in the solution of the grouping step as specified by u, and if $\gamma_A = 1$, the constraint is relaxed. Inequality (23) allows the model to relax this condition up to n − 1 times. For example, when solving for a five-intermediate problem, Inequalities (22) and (23) together ensure that a new solution is produced that allows for at most four intermediate batches already in $\mathcal{A}$.

Configuration Step

Given a group of products, the production optimization problem becomes a continuous NLP. Figure A.2 illustrates the example of Figure 6 when we impose a group {Prod2, Prod3}. Note that there is a single intermediate, and therefore no need to identify the intermediate batch by index.

[Figure A.2 diagram labels: Premixes (Pre1, Pre2, Pre3, Pre4, Pre5); Common intermediate (Pre, e, Post); Products (Prod2, Prod3).]

Figure A.2: The configuration subproblem layout shows that only one common intermediate exists, eliminating the need for the binary variable that assigns products to intermediates.

This variant of the problem has no integer variables, allowing us to use standard NLP solvers. Because the singleton problem can be interpreted as a special case of the configuration problem where only one product exists, we implement the singleton computation by using the configuration model code.

In the configuration step, we solve the related NLPs to identify (locally) minimum-cost values of producing these independent groups of mixtures. These problems are solved in parallel using the threaded capability of the SAS NLP solver.

Selection Step

The selection step is simply a cardinality-constrained set covering of $\mathcal{K}$ using groups in $\mathcal{A}$ as eligible subsets. The goal is to minimize the total optimal cost of production

\[ c^*_{\mathcal{A},n} = \sum_{A_i \in T^*_{\mathcal{A},n}} c_{A_i} \]

with an optimal partition $T^*_{\mathcal{A},n}$, where $|T^*_{\mathcal{A},n}| = n$. Because $\mathcal{A}$ is a relatively small set, we can easily solve this problem with the SAS/OR MILP solver.

Parameters
• $x_{kA}$: Mass percentage of intermediate component in production of product k for solution of group A.
• $v_{pA}$: Mass percentage of premix p in production of pre-evaporated intermediate for solution of group A.
• $c_{kA}$: Cost of producing product k using group A solution:

\[ c_{kA} = q^s_k q^v_k \sum_{p \in \mathcal{P}} c_p \bigl( z_{pk} + x_{kA} e_A v_{pA} \bigr). \tag{24} \]

Variables
• $s_A = 1$ if group A is used in the solution, 0 otherwise.
• $w_{kA} = 1$ if intermediate of product k is derived from group A, 0 otherwise.

Selection Problem Formulation

\[ \text{Minimize} \quad \sum_{A \in \mathcal{A}} \sum_{k \in A} c_{kA} w_{kA} \tag{25} \]

\[ \text{subject to} \quad \sum_{A \in \mathcal{A} :\, k \in A} w_{kA} = 1 \quad \forall\, k \in \mathcal{K}, \tag{26} \]

\[ \sum_{A \in \mathcal{A}} s_A = n, \tag{27} \]

\[ \sum_{k \in A} w_{kA} \ge s_A \quad \forall\, A \in \mathcal{A}, \tag{28} \]

\[ w_{kA} \le s_A \quad \forall\, A \in \mathcal{A},\ k \in A, \tag{29} \]

\[ s_A \in \{0, 1\} \quad \forall\, A \in \mathcal{A}, \tag{30} \]

\[ w_{kA} \in \{0, 1\} \quad \forall\, A \in \mathcal{A},\ k \in A. \tag{31} \]
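The selection step can be illustrated on a toy pool by enumeration. The Python sketch below is a simplified stand-in for the MILP above, under two assumptions that are not in the formulation: it only considers selections of n pairwise-disjoint groups (true partitions), and it uses one aggregate cost per group instead of per-product costs. Group costs are hypothetical.

```python
from itertools import combinations


def select_groups(products, group_costs, n):
    """Cheapest choice of n disjoint pool groups that together cover all products."""
    products = frozenset(products)
    best = None
    for combo in combinations(group_costs, n):
        covered = frozenset().union(*combo)
        disjoint = sum(len(g) for g in combo) == len(covered)
        if covered == products and disjoint:
            total = sum(group_costs[g] for g in combo)
            if best is None or total < best[0]:
                best = (total, set(combo))
    return best  # None when no partition of size n exists in the pool


# Hypothetical pool: three singletons plus two pairs.
pool = {
    frozenset({"P1"}): 5.0,
    frozenset({"P2"}): 6.0,
    frozenset({"P3"}): 4.0,
    frozenset({"P1", "P2"}): 12.0,
    frozenset({"P2", "P3"}): 10.5,
}
best_two = select_groups({"P1", "P2", "P3"}, pool, 2)
best_three = select_groups({"P1", "P2", "P3"}, pool, 3)
```

Solving for several values of n against the same pool is what allows a whole range of intermediate counts to be reported from a single run.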

Algorithm Enhancements (Augmentation and Bounding)

Initially, when we start with singletons and the first round of groupings, $\mathcal{A}$ contains at most $|\mathcal{K}| + n$ subsets. We note that the similarity-based grouping step attempts to heuristically exploit an observation of optimal intermediate compositions but is not guaranteed to lead to optimal intermediate batch groups. Furthermore, because of the nonconvexity of the problem, the configuration step likely produces solutions that are not globally optimal.

To address these issues, we iterate through the grouping, configuration, and selection steps to accomplish two goals: (1) improve our approximation of the optimal region of $\mathbb{P}(\mathcal{K})$ by augmenting $\mathcal{A}$; and (2) improve the global optimality of cost values by evaluating subgroup relationships.

Item (1) occurs automatically within the formulation by the inclusion of constraints that require at least one new group not already in $\mathcal{A}$ to be generated, ensuring that at least one next-best product group is appended to $\mathcal{A}$ for subsequent steps. Item (2) requires more explanation, which we illustrate in the following example.

Splitting a group A into a partition $T_A$ allows each subgroup of the partition to have its own intermediate batch (and therefore more flexibility in the choice of mixture composition). Global optimality thus requires that

\[ c_A \ge \sum_{A_i \in T_A} c_{A_i}. \tag{32} \]

When we restrict solution $s_A$ to the products in $A_i \subseteq A$ (by eliminating any $k \in A \setminus A_i$), it becomes feasible for any such subgroup. It is therefore advantageous to replace $s_{A_i}$ by $s_A$ whenever the condition $c_A < c_{A_i}$ is detected for any subset of A. This check is performed for all groups in the solution pool $\mathcal{A}$ and is motivated by our knowledge that a known locally optimal solution of $A_i$ might be inferior (more costly) to a known solution of A because of the effect of nonconvexity. The process ensures that we provide the best-known coefficients for the objective function in the selection step by exploiting the hierarchy in the problem, even if we have not converged to global optimality for every NLP calculation in the configuration step.
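One way to read the subgroup check is in terms of per-product costs: restricting a group's solution to a subgroup keeps only the costs of the subgroup's products. The Python sketch below follows that reading, which is an interpretation rather than a statement of the authors' implementation; the pool and its costs are hypothetical.

```python
def tighten_costs(product_costs):
    """Replace a subgroup's costs when a superset's restricted solution is cheaper.

    product_costs maps each group (frozenset) to {product: cost under that group}.
    """
    improved = {group: dict(costs) for group, costs in product_costs.items()}
    for sub in product_costs:
        for sup, sup_costs in product_costs.items():
            if sub < sup:  # strict subset: the superset solution is feasible for sub
                restricted = {k: sup_costs[k] for k in sub}
                if sum(restricted.values()) < sum(improved[sub].values()):
                    improved[sub] = restricted
    return improved


# Hypothetical pool in which the NLP for {P1} stalled at a poor local optimum.
pool_costs = {
    frozenset({"P1"}): {"P1": 6.0},
    frozenset({"P2"}): {"P2": 3.0},
    frozenset({"P1", "P2"}): {"P1": 4.0, "P2": 5.0},
}
tightened = tighten_costs(pool_costs)
```

After tightening, the singleton {P1} inherits the cheaper cost found inside the pair, while {P2} keeps its own solution because it was already cheaper.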

We further exploit this idea to augment $\mathcal{A}$ with all partitions of its members, because initializing all such partitions with the solution of their supersets is possible. For example,

\[ \mathcal{A} = \bigl\{ \{P1\}, \{P2\}, \{P3\}, \{P4\}, \{P5\}, \{P2, P3\}, \{P1, P4, P5\} \bigr\} \tag{33} \]

can be augmented with $\{\{P1, P4\}, \{P1, P5\}, \{P4, P5\}\}$, with each new subgroup being initialized with the restricted portion of $s_{\{P1,P4,P5\}}$ and $c_{\{P1,P4,P5\}}$.

We cannot guarantee global optimality for the configuration step because of the nonconvexity of the problem; therefore, $c_A$ can only be regarded as an upper bound to the production cost of a group A. Ideally, for the solution methodology to converge with some confidence of optimality, it would be helpful to identify lower bounds $\underline{c}_A$ for costs of groups in $\mathbb{P}(\mathcal{A})$ for which we have not solved the associated NLPs. We illustrate this using the following example.

Consider a scenario in which we are solving a five-product, two-intermediate problem and have built $\mathcal{A}$ defined in Equation (33). Such a pool would be constructed by the solution of the singleton step with one augmentation from the grouping step that produces $\{\{P2, P3\}, \{P1, P4, P5\}\}$. We solve an NLP for each member of $\mathcal{A}$ to calculate its corresponding cost. Although we have not solved the NLP for the group $\{P1, P2, P3\}$, we can estimate its lower bound as a consequence of Inequality (32),

\[ \underline{c}_{\{P1,P2,P3\}} := \max \bigl\{ (c_{\{P1\}} + c_{\{P2\}} + c_{\{P3\}}),\ (c_{\{P1\}} + c_{\{P2,P3\}}) \bigr\}. \]

We define $\underline{\mathcal{A}} = \mathcal{A} \cup \{\{P1, P2, P3\}\}$ and the corresponding optimal cost (derived by calculating $T^*_{\underline{\mathcal{A}}}$ in the selection step):

\[ c^*_{\underline{\mathcal{A}},n} = \sum_{A_i \in T^*_{\underline{\mathcal{A}},n}} \underline{c}_{A_i}. \]

We could augment $\underline{\mathcal{A}}$ with all members of $\mathbb{P}(\mathcal{A})$ because the lower-bound estimate can always be calculated from singletons. For large problems, $\mathbb{P}(\mathcal{A})$ is prohibitively large; therefore, we instead augment based on building supersets of existing groups.

We have been careful to call this an estimate of the lower bound because nonconvexity could prevent us from accurately calculating the costs of each $A_i$, because $c_{A_i}$ is only an upper bound. Therefore, a true lower bound cannot be guaranteed (although it continues to improve as the algorithm iterates). Given some desired optimality gap $\epsilon_n$, we use

\[ \tilde{\epsilon}_n := \bigl( c^*_{\mathcal{A},n} - c^*_{\underline{\mathcal{A}},n} \bigr) / c^*_{\mathcal{A},n} \le \epsilon_n \tag{34} \]

as a heuristic stopping criterion, with the expectation that the algorithm might terminate prior to achieving true global optimality within $\epsilon_n$.
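The lower-bound estimate for an unsolved group and the resulting stopping test can be sketched in Python as follows. The recursion enumerates every way to partition the group into groups with known costs and keeps the largest total, mirroring the maximum in the example above. Costs are hypothetical, and normalizing the gap by the upper bound is an assumption made for this sketch.

```python
def lower_bound_estimate(group, known_costs):
    """Largest total cost over partitions of `group` into groups with known costs."""
    group = frozenset(group)
    if not group:
        return 0.0
    anchor = min(group)  # fix one product so each partition is visited once
    best = None
    for part, cost in known_costs.items():
        if anchor in part and part <= group:
            rest = lower_bound_estimate(group - part, known_costs)
            if rest is not None and (best is None or cost + rest > best):
                best = cost + rest
    return best  # None if the group cannot be partitioned from known groups


def relative_gap(upper, lower_estimate):
    """Estimated optimality gap between upper bound and lower-bound estimate."""
    return (upper - lower_estimate) / upper


# Hypothetical known costs for singletons and one pair.
known = {
    frozenset({"P1"}): 5.0,
    frozenset({"P2"}): 6.0,
    frozenset({"P3"}): 4.0,
    frozenset({"P2", "P3"}): 10.5,
}
estimate = lower_bound_estimate({"P1", "P2", "P3"}, known)  # max(15.0, 15.5)
gap = relative_gap(16.0, estimate)
should_stop = gap <= 0.03
```

With these numbers the estimated gap is just above three percent, so the loop would continue for another iteration of grouping, configuration, and selection.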

Algorithm SummaryHere, we summarize the steps in the algorithm.

1. Singleton step. Let iteration count $m = 0$. Initialize $\mathcal{A}$ with all singletons, solve singleton NLP configuration problems, and calculate similarities $\tilde{a}_{kl}$ for all pairs of singletons.

2. Grouping step. Let $m = m + 1$. Solve the grouping problem and append candidate groups to $\mathcal{A}$, including partitions of each group. Also, append to $\underline{\mathcal{A}}$ additional groups derived from combinations of members of $\mathcal{A}$.

3. Configuration step. In parallel, solve independent configuration problem NLPs for new groups of $\mathcal{A}$.

4. Selection step. Solve set-covering problems for upper bounds $c^*_{\mathcal{A},n}$ and lower-bound estimates $c^*_{\underline{\mathcal{A}},n}$, respectively.

5. Terminate the algorithm if Inequality (34) holds; else go to step 2.
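The control flow of these steps can be sketched as a short Python loop. This is only a skeleton under stated assumptions: the NLP solver, the grouping model, and the two selection problems are passed in as hypothetical callables (`solve_nlp`, `propose_groups`, `upper_bound`, `lower_estimate`), the similarity calculation is omitted, the gap is taken relative to the upper bound, and `max_iter` is a safeguard added here that is not part of the summary.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Tuple

Group = FrozenSet[str]
Costs = Dict[Group, float]


def run_algorithm(
    products: List[str],
    solve_nlp: Callable[[Group], float],
    propose_groups: Callable[[Costs, int], Iterable[Group]],
    upper_bound: Callable[[Costs], float],
    lower_estimate: Callable[[Costs], float],
    eps: float,
    max_iter: int = 50,
) -> Tuple[float, float, int]:
    # 1. Singleton step: solve one configuration NLP per product.
    costs: Costs = {frozenset([p]): solve_nlp(frozenset([p])) for p in products}
    m = 0
    while True:
        # 2. Grouping step: collect candidate groups not yet in the pool.
        m += 1
        new_groups = [g for g in propose_groups(costs, m) if g not in costs]
        # 3. Configuration step: the NLPs are independent, so a real
        #    implementation could solve them in parallel.
        for group in new_groups:
            costs[group] = solve_nlp(group)
        # 4. Selection step: upper bound and lower-bound estimate.
        ub = upper_bound(costs)
        lb = lower_estimate(costs)
        # 5. Termination test on the estimated relative gap.
        if (ub - lb) / ub <= eps or m >= max_iter:
            return ub, lb, m


# Toy stand-ins, only to show that the loop runs and terminates.
def toy_propose(costs: Costs, m: int) -> List[Group]:
    names = sorted(p for g in costs if len(g) == 1 for p in g)
    return [frozenset(names[: m + 1])]


result = run_algorithm(
    products=["P1", "P2", "P3"],
    solve_nlp=lambda g: 10.0 * len(g),
    propose_groups=toy_propose,
    upper_bound=lambda costs: 100.0,
    lower_estimate=lambda costs: 70.0 + 5.0 * len(costs),
    eps=0.06,
)
print(result)
```

Passing the solvers in as functions keeps the loop independent of any particular optimization package; the toy callables above carry no modeling meaning.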

References

ASTM International (2014) Standard Guide for Evaluating Stain Removal Performance in Home Laundering (ASTM International, West Conshohocken, PA).

Audet C, Brimberg J, Hansen P, Le Digabel S, Mladenovic N (2004) Pooling problem: Alternate formulations and solution methods. Management Sci. 50(6):761–776.

Becerra C (2014) Observed benefits of the portfolio optimization approach provided via email communication with Natalie Esquejo, June.

Bodington CE, Baker TE (1990) A history of mathematical programming in the petroleum industry. Interfaces 20(4):117–127.

Box GEP, Stuart Hunter J, Hunter WG (2005) Statistics for Experimenters: Design, Innovation, and Discovery, 2nd ed. (John Wiley & Sons, Hoboken, NJ).

Burnham KP, Anderson DR (2002) Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd ed. (Springer, New York).

Burnham KP, Anderson DR (2004) Multimodel inference: Understanding AIC and BIC in model selection. Sociol. Methods Res. 33(2):261–304.

Dey S, Gupte A (2013) Analysis of MILP techniques for the pooling problem. Accessed April 1, 2015, http://www.optimization-online.org/DB_FILE/2013/04/3849.pdf.

Floudas CA, Aggarwal A (1990) A decomposition strategy for global optimum search in the pooling problem. ORSA J. Comput. 2(3):225–235.

Goos P, Jones B (2011) Optimal Design of Experiments: A Case Study Approach (John Wiley & Sons, Hoboken, NJ).

Gupte A, Ahmed S, Dey S, Cheon M (2013) Pooling problem: Relaxations and discretizations. Accessed April 1, 2015, http://www.optimization-online.org/DB_FILE/2012/10/3658.pdf.

Johnson RT, Montgomery DC, Jones BA (2011) An expository paper on optimal design. Quality Engrg. 23(3):287–301.

Kutner M, Nachtsheim C, Neter J, Li W (2004) Applied Linear Statistical Models, 5th ed. (McGraw-Hill/Irwin, New York).

Miller AJ (1990) Subset Selection in Regression (Chapman & Hall, London).

Visweswaran V, Floudas CA (1990) A global optimization algorithm (GOP) for certain classes of nonconvex NLPs—II. Applications of theory and test problems. Comput. Chemical Engrg. 14(12):1419–1434.

Nats Esquejo received her Bachelor of Science degree in chemical engineering from the University of the Philippines in 1996 and is a section head in R&D at Procter & Gamble. She has extensive experience in both process and formulation design in the fabric and home care business and is working in the modelling and simulation group, focused on integration and optimization of models, and application of big data techniques. She also conducts training and consults on design of experiments and process control techniques internally in P&G.

Kevin Miller received his bachelor’s degree in chemistry from Xavier University in 1999. He is a principal researcher at Procter & Gamble working in Fabric and home care modelling and simulation. He started in laundry product design for North American Granules products and continued product design in Central Eastern European and Latin American Granules. His current focus in modelling and simulation is on model integration and product optimization.

Kevin Norwood received his PhD in physical chemistry from Iowa State University in 1990 and is a research fellow in R&D at Procter & Gamble. He leads technical work to create and apply modeling approaches to formulate products within the fabric and home care businesses. His current work is focused on integration of models across disciplines. He started with P&G in 1991 and has worked in analytical science, technology, formulation, and modeling, where he has spent the majority of his career.

Ivan Oliveira manages the Advanced Analytics and Optimization Services (AAOS) group at SAS, where he has directed projects in operations research (OR) and optimization applications in a variety of industries. AAOS projects deliver consulting expertise to SAS customers in the field of OR, inventory optimization, revenue management and price optimization, and related technologies. A sample of AAOS projects includes optimal scheduling for ATM cash replenishment, portfolio optimization in government, revenue management in various industries, retail inventory replenishment and pricing, chemical mixture portfolio optimization in CPG, optimal assignment of delinquent loan processing, and simulation for drug discovery. AAOS is also engaged in internal SAS R&D projects, including optimization for data mining. He earned his BS in mechanical engineering at the University of Virginia and MS and PhD in mechanical engineering at Massachusetts Institute of Technology.

Rob Pratt has worked at SAS since 2000 and is a senior manager in the Operations Research Department within SAS R&D’s Advanced Analytics Division. He manages a team of developers responsible for the optimization modeling language, network algorithms, and the decomposition algorithm. He earned a BS in mathematics (with a second major in English) from the University of Dayton, and both an MS in mathematics and a PhD in operations research from the University of North Carolina at Chapel Hill.

Ming Zhao is assistant professor in the Department of Decision and Information Sciences at the University of Houston. He served as senior operations research specialist in the Advanced Analytics and Optimization Services (AAOS) group at SAS. The projects he has worked on include chemical mixture portfolio optimization for P&G, operating room scheduling, renewable energy integration and power system operations, retail inventory replenishment and pricing, and optimization for data mining. He earned his PhD from University at Buffalo in 2008 and was a postdoctoral researcher at IBM T.J. Watson Research Center, where he worked primarily on the unit commitment problem and supply chain management in the mining industry.
