Post on 18-Dec-2015
Statistical Modeling in Stochastic Dynamic Programming for a Decision-Making Framework
Dr. Julia C. Tsai
Krannert School of Management, Purdue University
December 15, 2003
Outline
• Decision-Making Framework
• Stochastic Dynamic Programming
• Statistical Modeling within the DMF
• Multivariate Adaptive Regression Splines
• Parallel MARS
• Flexible Implementations of MARS
• DMF Results
• Conclusions
A Modular Decision-Making Framework
For time period/level/stage t:
xt = state of system
ut = decision/control
Stochastic Dynamic Programming
Solves a decision problem that spans multiple periods/levels/stages
Applications:
• Inventory Forecasting: up to 9 dimensions (Chen 1999)
• Airline Revenue: 31 flight legs (Chen, Günther, Johnson 2000)
• Wastewater Treatment System: 20 dimensions (Tsai et al. 2002)
Inventory Forecasting
Modeled by Heath and Jackson (1991) using the Martingale Model of Forecast Evolution
Objective: Minimize inventory holding and backorder costs.
Time Periods/Levels/Stages: Months, weeks.
State xt at the beginning of Stage t: Inventory levels and product forecasts.
Decision ut in Stage t: Amount ordered.
Constraints: Capacities on order quantities.
Random Variables: Errors in the forecasts.
Transition: For inventory, xt+1 = xt - demand + order quantity.
Airline Revenue Management
Research with Ellis Johnson (Georgia Tech), Dirk Günther (Sabre), and Jay Rosenberger (UTA)
Objective: Maximize revenue before a specified departure date.
Time Periods/Levels/Stages: weeks, days.
State xt at the beginning of Stage t: Remaining capacities on the flight legs in the network.
Decision ut in Stage t: Accept or Reject a customer’s airfare request for a specified origin-destination itinerary.
Constraints: Capacities on flight legs.
Random Variables: Customer demand.
Transition: xt+1 = xt - (# seats sold in stage t).
Wastewater Treatment System[1]
• 11-level liquid line and 6-level solid line
• At each level, select one of several unit processes to complete the treatment system
Objectives:
• Evaluate various technologies in different levels
• Identify which technologies should be explored more in the future
[1] Developed by Dr. Bruce Beck and Dr. Jining Chen
State Variables: To measure the quality of water
Liquid:
• Chemical oxygen demand
• Suspended solids
• Organic-nitrogen
• Ammonia-nitrogen
• Nitrate-nitrogen
• Total phosphorus
• Heavy metal
• Synthetic organic chemicals
• Pathogens
• Viruses
Solid:
• Volume of sludge
• Sludge water content
• Sludge organic-carbon
• Sludge inorganic-carbon
• Sludge organic-nitrogen
• Sludge ammonia-nitrogen
• Sludge total phosphorus
• Sludge heavy metal
• Sludge synthetic organic chemicals
• Sludge pathogens
Technology Units
Liquid Line:
Level 1: Flow equalisation tank
Level 2: Vortex SSO; Sedimentation tank; Chemical precipitation
Level 3: Physical irradiation; Ozonation
Level 5: Activated sludge (C); Activated sludge (C,N); Activated sludge (C,P); Activated sludge (C,P,N); High biomass activated sludge; Activated sludge (N); Multi reactor and deep shaft system; A-B system; Trickling filter; Rotating biological contactors; UASB system; Reed bed system; Lagoons and ponds
Level 6: Sedimentation tank; Microfiltration; Reverse osmosis; Chemical precipitation
Level 7: Physical filtration; Microfiltration; Reverse osmosis; Chemical precipitation
Level 8: Physical irradiation; Ozonation
Level 9: Air stripping; Ammonia stripping
Level 10: Chlorine disinfection; Chlorating disinfection
Level 11: GAC adsorption; Infiltration basin
Solid Line:
Level 1 (12): Sludge storage tank; Sludge thickening tank
Level 2 (13): Sludge dewatering bed; Sludge Garver-Greenfield drying unit; Sludge Vertech system + ammonia stripping; Sludge CWOP-UASB process + ammonia stripping; Sludge hydrolysis + UASB; Anaerobic digestion; Aerobic digestion; Aerobic-anaerobic digestion
Level 3 (14): Filter and belt; Permanent thermal process; Thermo-chemical liquefaction process
Level 4 (15): Sludge dewatering bed in second stage
Level 5 (16): Physical irradiation
Level 6 (17): Chemical fixation; Incineration; Thermal treatment for building materials
Objectives of the SDP:
To minimize
• Economic cost (capital & operating)
• Odor emissions
• Size of treatment system (land area or volume)
or maximize
• Robustness against extreme conditions
• Desirability of the global environment
Constraints:
1. Cleanliness of the influent entering each level
2. Stringent clean water targets exiting the final level of the system
Stochastic Dynamic Programming (SDP)
Objective: Minimize expected cost over T stages.
Optimal Value Function Ft(xt) in Stage t: Minimum expected cost to operate the system over stages t through T.
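A standard finite-horizon formulation consistent with these two definitions can be written as follows. The stage cost c_t, the random vector ε_t, and the transition function f_t are assumed notation for illustration, not symbols taken from the slides.

```latex
\min_{u_1,\dots,u_T} \; \mathbb{E}\left[ \sum_{t=1}^{T} c_t(x_t, u_t, \epsilon_t) \right]
\quad \text{subject to} \quad x_{t+1} = f_t(x_t, u_t, \epsilon_t)

F_t(x_t) = \min_{u_t} \; \mathbb{E}\left[ c_t(x_t, u_t, \epsilon_t) + F_{t+1}(x_{t+1}) \right],
\qquad F_{T+1} \equiv 0
```

The second line is the backward recursion: the value at stage t is the best expected stage cost plus the value of the state reached at stage t+1.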
Algorithm for Continuous-State SDP
1. Choose S discretization points in the state space.
2. In each stage t = T,…,1:
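a. At each discretization point xj, j = 1, …, S: minimize the expected value of the stage-t cost plus the estimated future value function for stage t+1.
b. Approximate Ft(x) with a statistical model fit to the S optimal values.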
(Chen, Ruppert, Shoemaker 1999)
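The steps above can be sketched on a toy one-dimensional inventory problem. This is a minimal illustration under stated assumptions, not the presenter's implementation: the costs, demand distribution, order set, and state bounds are all hypothetical, and piecewise-linear interpolation (`np.interp`) stands in for the statistical model (MARS in this talk).

```python
import numpy as np

def solve_sdp(T=3, S=21, x_min=-5.0, x_max=10.0):
    """Backward recursion on S discretization points of a 1-D toy inventory state."""
    grid = np.linspace(x_min, x_max, S)      # step 1: discretization points
    orders = np.arange(0.0, 6.0)             # feasible order quantities (hypothetical)
    demand = np.array([1.0, 2.0, 3.0])       # demand outcomes (hypothetical)
    prob = np.array([0.3, 0.4, 0.3])         # their probabilities (hypothetical)
    hold, back = 1.0, 4.0                    # unit holding / backorder costs (hypothetical)

    future = np.zeros(S)                     # value after the last stage is zero
    values = {}
    for t in range(T, 0, -1):                # step 2: stages T, ..., 1
        current = np.empty(S)
        for j, x in enumerate(grid):         # step 2a: optimize at each point
            best = np.inf
            for u in orders:
                # transition: inventory - demand + order, kept inside the grid
                nxt = np.clip(x - demand + u, x_min, x_max)
                stage_cost = hold * np.maximum(nxt, 0) + back * np.maximum(-nxt, 0)
                # expected stage cost plus interpolated future value
                total = prob @ (stage_cost + np.interp(nxt, grid, future))
                best = min(best, total)
            current[j] = best
        values[t] = current                  # step 2b: grid values define the approximation
        future = current
    return grid, values

grid, values = solve_sdp()
```

Replacing `np.interp` with a fitted regression model is what lets the approach scale beyond a full grid, since the discretization points can then come from a sparse experimental design.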
Statistical Modeling Process
[Figure: flow diagram linking SDP periods/stages/levels t+1, t, and t-1. Components: Experimental Design, State Vector Values, Optimization, Data for the Future Value Function, Statistical Model, Estimated Future Value Function.]
Design of Experiments
Each experimental run
• sets each factor at a specific level
• corresponds to a point in the n-dimensional space
Design of Experiments Options
FF: Full factorial or complete grid designs
OA: Orthogonal array designs (Bose and Bush 1952, Chen 2001)
LH: Latin hypercube designs (McKay et al. 1979)
OA-LH: Hybrid (Tang 1993)
Orthogonal Array Designs
OA Parameters:
• n factors
• strength d (d < n)
• p levels
• frequency λ
When projected down onto any d dimensions, it produces a FF grid of p^d points replicated λ times.
A LH design is equivalent to an OA of strength 1.
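A small sketch can make the size difference between these designs concrete. The function names below are hypothetical, and the Latin hypercube is built in the simplest way (one random permutation of the levels per factor); it only illustrates the strength-1 projection property and is not a full OA construction.

```python
import numpy as np

def latin_hypercube(n_factors, p_levels, seed=0):
    """p_levels runs; each column is a random permutation of levels 0..p_levels-1."""
    rng = np.random.default_rng(seed)
    return np.column_stack([rng.permutation(p_levels) for _ in range(n_factors)])

def full_factorial(n_factors, p_levels):
    """Complete grid: p_levels ** n_factors runs."""
    grids = np.meshgrid(*[np.arange(p_levels)] * n_factors, indexing="ij")
    return np.column_stack([g.ravel() for g in grids])

def has_strength_one(design, p_levels):
    """True if every level appears equally often in each single column."""
    for col in design.T:
        counts = np.bincount(col, minlength=p_levels)
        if not np.all(counts == counts[0]):
            return False
    return True

lh = latin_hypercube(n_factors=3, p_levels=5)   # 5 runs
ff = full_factorial(n_factors=3, p_levels=5)    # 125 runs
```

Both designs cover every level of every factor evenly in one dimension, but the full factorial grows as p^n while the Latin hypercube needs only p runs.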
Cubic Regression Splines
Univariate cubic regression splines commonly have the form:
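One common way to write such a spline is the truncated power basis, shown here as an illustration. The coefficients β, the knots κ_k, and the knot count K are assumed notation and may differ from the form used on the original slide.

```latex
f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3
       + \sum_{k=1}^{K} \beta_{3+k} \, (x - \kappa_k)_+^3 ,
\qquad (z)_+ = \max(0, z)
```

Each truncated term is zero to the left of its knot, so the fit is a cubic polynomial on every interval between knots with continuous first and second derivatives.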
Multivariate Adaptive Regression Splines
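B1 = H[–(Xva–ka)] , B2 = H[+(Xva–ka)]
B3 = H[–(Xva–ka)]H[–(Xvb–kb)]
B4 = H[–(Xva–ka)]H[+(Xvb–kb)]
[Figure: basis functions 1 to 4 drawn as a tree. Functions 1 and 2 are the – and + sides of knot ka on variable va; functions 3 and 4 are the – and + sides of knot kb on variable vb, both built on function 1.]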
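The four basis functions can be evaluated directly. This sketch assumes H[η] is the truncated linear function max(0, η), which is the usual choice for MARS but is not stated on the slide; the function and argument names are hypothetical.

```python
import numpy as np

def H(eta):
    """Truncated linear function, assumed here to be max(0, eta)."""
    return np.maximum(0.0, eta)

def basis_functions(X, va, ka, vb, kb):
    """Evaluate B1..B4 at the rows of X; va and vb are column indices, ka and kb are knots."""
    b1 = H(-(X[:, va] - ka))          # minus side of knot ka
    b2 = H(+(X[:, va] - ka))          # plus side of knot ka
    b3 = b1 * H(-(X[:, vb] - kb))     # B1 times minus side of knot kb
    b4 = b1 * H(+(X[:, vb] - kb))     # B1 times plus side of knot kb
    return np.column_stack([b1, b2, b3, b4])

X = np.array([[0.0, 0.0], [2.0, 3.0], [0.0, 3.0]])
B = basis_functions(X, va=0, ka=1.0, vb=1, kb=2.0)
```

B3 and B4 are two-way interaction terms: they are nonzero only where the parent B1 is nonzero, which is how MARS localizes interactions to part of the state space.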
MARS Forward Stepwise
1. Loop through potential new basis functions:
• Select parent basis function m
• Select variable v
• Select knot k
2. For each m, v, k:
• Compute lack-of-fit
• Compare to current best based on lack-of-fit
3. For the best m, v, k: create two new basis functions
4. Continue searching for new basis functions until the stopping rule (e.g., Mmax) is met
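The four steps above can be sketched as a short search loop. This is a simplified illustration, not the talk's implementation: lack-of-fit is taken to be the residual sum of squares of a least-squares fit, candidate knots are a subsample of the observed values, and `m_max` counts the constant basis function. All names are hypothetical.

```python
import numpy as np

def forward_stepwise(X, y, m_max=5, max_knots=10):
    """Greedy forward search over (parent m, variable v, knot k) triples."""
    n, d = X.shape
    basis = [np.ones(n)]     # start from the constant basis function
    used = [set()]           # variables already present in each basis function
    terms = []               # (m, v, k) chosen at each step
    while len(basis) + 2 <= m_max:               # step 4: stopping rule
        best = None
        for m, parent in enumerate(basis):       # step 1: parent m
            for v in range(d):                   # variable v
                if v in used[m]:
                    continue                     # no repeated variable in a product
                knots = np.unique(X[:, v])[:: max(1, n // max_knots)]
                for k in knots:                  # knot k
                    plus = parent * np.maximum(0.0, X[:, v] - k)
                    minus = parent * np.maximum(0.0, k - X[:, v])
                    A = np.column_stack(basis + [plus, minus])
                    coef = np.linalg.lstsq(A, y, rcond=None)[0]
                    lof = np.sum((y - A @ coef) ** 2)   # step 2: lack-of-fit (RSS, assumed)
                    if best is None or lof < best[0]:
                        best = (lof, m, v, k, plus, minus)
        if best is None:
            break
        _, m, v, k, plus, minus = best           # step 3: add the two new functions
        basis += [plus, minus]
        used += [used[m] | {v}, used[m] | {v}]
        terms.append((m, v, k))
    A = np.column_stack(basis)
    coef = np.linalg.lstsq(A, y, rcond=None)[0]
    return terms, coef, A
```

The triple loop is where the run time goes, since every candidate requires a new fit. That cost is what motivates both the parallel search and the stopping rules discussed in this talk.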
Parallel MARS
• Master-Slave paradigm
• Software: MPI (Message-Passing Interface)
Parallel MARS Algorithm
[Figure: flowchart. C0: Initialization/Data Processing. Processors C1, C2, …, CP-1 work in parallel. C0: Select the overall best knot and update b.f. Meet Mmax? YES: STOP. NO: repeat.]
Parallel Performance Measure:
tP : Time using Parallel MARS with P processors
t1 : Time using Parallel MARS with 1 processor
Speedup (SP) = t1 / tP
Computing Facility:
Processor: 550 MHz Pentium III Xeon
Storage: 4 GB RAM, 18 GB SCSI disk
OS: RedHat Linux 7.1
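As a small worked illustration of the speedup measure, the sketch below applies SP = t1 / tP to made-up run times. The timings are hypothetical and are not measurements from this study.

```python
def speedup(t1, tp):
    """Speedup S_P = t1 / tP, with both run times in the same unit."""
    if tp <= 0:
        raise ValueError("run time must be positive")
    return t1 / tp

# Hypothetical run times in seconds, keyed by number of processors.
timings = {1: 120.0, 2: 70.0, 4: 45.0}
speedups = {p: speedup(timings[1], t) for p, t in timings.items()}
```

A speedup equal to P would be ideal; communication between the master and the other processors typically keeps the measured value below that.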
Results
[Figure: Speedup vs. No. of Processors, N = 289, K = 35. x-axis: No. of Processors (1 to 10); y-axis: Speedup (1 to 4); one curve each for Mmax = 50, 100, 150, 200.]
The Drawbacks of MARS
• Mmax is difficult to select
– Different SDP time periods may require different Mmax for a good approximation
– Computational effort required to identify the best Mmax for each time period is impractical
• Multiple basis functions can be “equivalently” good based on lack-of-fit
– MARS is a greedy algorithm
– Final approximation may involve more higher-order interaction terms than necessary
ASR-MARS (Automatic Stopping Rule)
Use of R² and R²a: (adjusted) coefficient of determination
• ASR-I: Stop the MARS approximation search process when ΔR² < ε or ΔR²a < ε
• ASR-II: Stop the MARS approximation search process when ΔR² / R² < ε or ΔR²a / R²a < ε
Here ΔR² is the change in R² after adding new basis functions and ε is a small threshold.
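The two stopping rules can be sketched as a check on the sequence of R² values recorded as basis functions are added. This is an illustration under assumptions: the rules are read as thresholds on the gain in R² (absolute for ASR-I, relative to the current R² for ASR-II), and the adjusted R² uses the textbook formula. Function names and the sample values are hypothetical.

```python
def adjusted_r2(r2, n_obs, n_terms):
    """Adjusted R^2 for n_terms non-constant terms fit to n_obs observations."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_terms - 1)

def asr_stop_index(r2_path, eps, relative=False):
    """Return the first step whose gain in R^2 falls below eps, or None.

    relative=False: absolute gain (ASR-I as read here).
    relative=True:  gain divided by the current R^2 (ASR-II as read here).
    """
    for i in range(1, len(r2_path)):
        gain = r2_path[i] - r2_path[i - 1]
        if relative:
            if r2_path[i] <= 0:
                continue              # relative gain undefined; keep searching
            gain = gain / r2_path[i]
        if gain < eps:
            return i
    return None

# Hypothetical R^2 values after successive forward steps.
path = [0.50, 0.80, 0.90, 0.9001, 0.95]
stop_abs = asr_stop_index(path, eps=0.0002)
stop_rel = asr_stop_index(path, eps=0.0002, relative=True)
```

The sample path also shows the risk of a gain-based rule: the search stops at the first flat step even though a later step would have improved the fit.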
Mmax Relaxation: Slow vs. ASR-I (threshold = 0.0002)
Run Time: Slow 36h19m39s, ASR 34m
MAD (mean absolute deviation) & M (number of basis functions):
Level        11       10       9        8        7
MAD  Slow    137.2    91.68    71.95    51.90    9.69
MAD  ASR     133.8    105.4    74.03    42.73    37.82
M    Slow    244      142      231      195      139
M    ASR     202      159      89       89       120
Robust MARS
Choose lower-order interaction terms
For example:
If the highest allowable interaction order is 3, then three values I(i, Bi) are used to store the best basis function (Bi) of each order:
I(1, B1) = among univariate options
I(2, B2) = among two-way interaction options
I(3, B3) = among three-way interaction options
[Figure: decision flowchart. Start: assume I(3, B3) > I(2, B2) > I(1, B1). Two YES/NO tests then select the best b.f. as B3, B2, or B1.]
Robust MARS Results
           No Pruning   Pruning    =0.3       =0.01
Runtime    1:39:13      1:52:40    1:59:19    1:47:56
M          194          167        194        196
MAD        134.64       139.35     131.92     124.37
DMF Evaluation Measures
• Count = # times chosen as best
• MOD = mean overall deviation
• MLD = mean local deviation
• MLRD = mean local relative deviation
A promising technology has higher Count and lower MOD, MLD, MLRD
DMF Solution (Count): Slow vs. ASR
Level   Selected Units          Slow    ASR
2       Vortex SSO              2197    2197
3       Ozonation               2140    1992
5       UASB System             2134    1806
6       Microfiltration         1999    1893
7       Microfiltration         2167    1509
8       Physical Irradiation    1436    1359
9       Ammonia Stripping       2092    2191
10      Chlorine Disinfection   1869    1925
11      GAC Adsorption          1320    1320
Conclusions
• Parallel-MARS: Speedup becomes more significant as Mmax increases
• ASR-MARS: Tremendously reduced runtime for the statistical modeling process, and selected the same promising technologies as “Slow” Mmax relaxation
• Robust MARS: Reduced the mean absolute deviation of the test data set, which suggested a better statistical model