median predictive cost of error with an asymmetric cost function

Median Predictive Cost of Error with an Asymmetric Cost Function
Author(s): Michael Cain
Source: The Journal of the Operational Research Society, Vol. 40, No. 8 (Aug., 1989), pp. 735-740
Published by: Palgrave Macmillan Journals on behalf of the Operational Research Society
Stable URL: http://www.jstor.org/stable/2583680



J. Opl Res. Soc. Vol. 40, No. 8, pp. 735-740, 1989. 0160-5682/89 $3.00 + 0.10. Printed in Great Britain. All rights reserved. Copyright © 1989 Operational Research Society Ltd

Median Predictive Cost of Error with an Asymmetric Cost Function

MICHAEL CAIN Department of Economics, University College of Wales, Aberystwyth

The problem of reducing predictive cost is considered in the case when the cost of error function is not symmetric and the optimality criterion is the minimization of the median cost of the error of prediction. Examples are given and comparisons made with the usual solution based on the minimization of the mean cost. In the case of a Gaussian process, the median solution is found to be a simple additive adjustment to the predictive mean, and far easier to compute than the solution based on expected cost.

Key words: asymmetric cost function, median cost, prediction

INTRODUCTION

Consider a stationary sequence $\{X_t\}$ of continuous random variables for which it is required to predict $X_{t+k}$ ($k = 1, 2, \ldots$), given $X_t, X_{t-1}, \ldots$. Let the conditional distribution function of $X_{t+k}$, given $X_t, X_{t-1}, \ldots$, be $F_k$, the same for all $t$ since the process is stationary. It is required to find the optimal predictor $h = h(X_t, X_{t-1}, \ldots)$ for $X_{t+k}$, taking into account the cost $g(X_{t+k} - h)$ of the predictive error. Granger¹ considered this problem in the case of an asymmetric cost function $g$ and the optimality criterion the minimization of the expected predictive cost, $\int_{\mathbb{R}} g(x - h)\,dF_k(x)$. He obtained an explicit solution for an asymmetric linear cost function

$$g(x) = \begin{cases} ax, & x > 0 \quad (a > 0) \\ 0, & x = 0 \\ bx, & x < 0 \quad (b < 0), \end{cases}$$

and gave some simple conditions that ensure that the predictive mean is the optimal predictor. Since the expected predictive cost is not always of prime concern and the problems encountered are not so analytically tractable, we take a different approach and consider the optimality criterion of minimizing the median predictive cost, with a more general continuous asymmetric cost function:

$$g(x) = \begin{cases} g_1(x), & x > 0 \\ 0, & x = 0 \\ g_2(x), & x < 0, \end{cases}$$

where $g_1(0) = 0 = g_2(0)$, $g_1'(x) > 0$ for $x > 0$ and $g_2'(x) < 0$ for $x < 0$.

MINIMIZING MEDIAN PREDICTIVE COST

The distribution function, $\tilde{F}_k$, of the predictive cost is

$$\tilde{F}_k(u) = \int_{h + g_2^{-1}(u)}^{h + g_1^{-1}(u)} dF_k(x),$$

and the median predictive cost, $u^*$, satisfies

$$\tilde{F}_k(u^*) = \tfrac{1}{2}.$$

Write $l_1(u) = g_1^{-1}(u) > 0$, $l_2(u) = g_2^{-1}(u) < 0$ for $u > 0$. We seek $h$ to minimize $u^*$ and hence differentiate implicitly with respect to $h$ throughout the identity

$$\int_{h + l_2(u^*)}^{h + l_1(u^*)} dF_k(x) = \tfrac{1}{2}, \tag{1}$$


to yield the first-order condition

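$$\left\{1 + l_1'(u^*)\frac{du^*}{dh}\right\}\left[\frac{dF_k(x)}{dx}\right]_{x = h + l_1(u^*)} - \left\{1 + l_2'(u^*)\frac{du^*}{dh}\right\}\left[\frac{dF_k(x)}{dx}\right]_{x = h + l_2(u^*)} = 0.$$

Since for minimal $u^*$ we require $du^*/dh = 0$, the first-order condition reduces to

$$\left[\frac{dF_k(x)}{dx}\right]_{x = h + l_1(u^*)} = \left[\frac{dF_k(x)}{dx}\right]_{x = h + l_2(u^*)},$$

i.e.

$$f_k(h + l_1(u^*)) = f_k(h + l_2(u^*)), \tag{2}$$

where $f_k$ is the predictive density function of $X_{t+k}$. The second-order condition for a minimum is

$$\frac{d^2u^*}{dh^2} = \frac{f_k'(h + l_2(u^*)) - f_k'(h + l_1(u^*))}{l_1'(u^*)\,f_k(h + l_1(u^*)) - l_2'(u^*)\,f_k(h + l_2(u^*))} > 0.$$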

Subject to the second-order condition being satisfied, the optimal h, together with the corresponding optimal u*, satisfies (1) and (2), and in theory can be derived by solving these two equations in the two unknowns h, u*.
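As a rough numerical illustration of solving (1) and (2) together, the sketch below finds $u^*$ for a given $h$ by bisection on identity (1), and then minimizes $u^*$ over $h$ by golden-section search. The function names, the search bracket, the tolerances and the example cost function ($g_1(x) = 2x$, $g_2(x) = -x$ with a standard normal predictive distribution) are all assumptions made for illustration, and the search assumes that $u^*$ is unimodal in $h$.

```python
from statistics import NormalDist


def median_cost(h, cdf, l1, l2, u_hi=1e3, tol=1e-10):
    """Solve identity (1) for u*: cdf(h + l1(u)) - cdf(h + l2(u)) = 1/2."""
    def mass(u):
        # Probability that the predictive cost does not exceed u.
        return cdf(h + l1(u)) - cdf(h + l2(u))

    lo, hi = 0.0, u_hi  # mass(0) = 0; u_hi is assumed large enough that mass(u_hi) > 1/2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mass(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def minimize_median_cost(cdf, l1, l2, h_lo, h_hi, tol=1e-7):
    """Golden-section search for the h minimizing u*(h) on [h_lo, h_hi]."""
    ratio = (5 ** 0.5 - 1) / 2
    x1 = h_hi - ratio * (h_hi - h_lo)
    x2 = h_lo + ratio * (h_hi - h_lo)
    f1 = median_cost(x1, cdf, l1, l2)
    f2 = median_cost(x2, cdf, l1, l2)
    while h_hi - h_lo > tol:
        if f1 < f2:
            # Minimum lies in [h_lo, x2]; reuse x1 as the new right probe.
            h_hi, x2, f2 = x2, x1, f1
            x1 = h_hi - ratio * (h_hi - h_lo)
            f1 = median_cost(x1, cdf, l1, l2)
        else:
            # Minimum lies in [x1, h_hi]; reuse x2 as the new left probe.
            h_lo, x1, f1 = x1, x2, f2
            x2 = h_lo + ratio * (h_hi - h_lo)
            f2 = median_cost(x2, cdf, l1, l2)
    h = 0.5 * (h_lo + h_hi)
    return h, median_cost(h, cdf, l1, l2)


# Illustrative example (assumed values): N(0, 1) predictive distribution.
nd = NormalDist(0.0, 1.0)
l1 = lambda u: u / 2.0   # inverse of g1(x) = 2x on x > 0
l2 = lambda u: -u        # inverse of g2(x) = -x on x < 0
h_opt, u_opt = minimize_median_cost(nd.cdf, l1, l2, -3.0, 3.0)
```

At the returned pair, the densities at the two interval endpoints $h + l_1(u^*)$ and $h + l_2(u^*)$ should agree, which is condition (2).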

APPLICATION TO A GAUSSIAN PROCESS

To illustrate the procedure, consider a Gaussian process $\{X_t\}$ with the conditional distribution of $X_{t+k}$ given $X_t, X_{t-1}, \ldots$ being $N(M, \sigma^2)$, where $M$ is the predictive mean. Constraint (1) reduces to

$$\Phi\!\left(\frac{h + l_1(u^*) - M}{\sigma}\right) - \Phi\!\left(\frac{h + l_2(u^*) - M}{\sigma}\right) = \tfrac{1}{2}, \tag{3}$$

where $\Phi$ is the distribution function of the standard normal distribution, and (2) reduces to

$$\{h + l_1(u^*) - M\}^2 = \{h + l_2(u^*) - M\}^2$$

or, equivalently,

$$l_1(u^*) + l_2(u^*) = 2(M - h). \tag{4}$$

The second-order condition becomes

$$\frac{d^2u^*}{dh^2} = \frac{l_1(u^*) - l_2(u^*)}{\sigma^2\{l_1'(u^*) - l_2'(u^*)\}} > 0. \tag{5}$$

Note that if $g$ is symmetric, then $l_2(u) = -l_1(u)$ for $u > 0$, (5) holds and (4) gives the solution $h = M$; i.e. the predictive mean is optimal. More generally, (4) is equivalent to

$$\{h + l_2(u^*) - M\} = -\{h + l_1(u^*) - M\}$$

and (3) becomes

$$\frac{h + l_1(u^*) - M}{\sigma} = \Phi^{-1}(0.75),$$

which, using standard normal tables, yields

$$h + l_1(u^*) = M + 0.6745\sigma. \tag{6}$$

Observe that (4) and (6) imply that the solution h is merely an additive adjustment to M; note that in both these equations, h appears only with M, in the form M - h. Moreover, the solution satisfies

$$g_1(M - h + 0.6745\sigma) = g_2(M - h - 0.6745\sigma),$$


but to avoid inappropriate solutions of this equation, it is more convenient to continue to consider both equations (4) and (6). To proceed further it is necessary to be more precise about the functional form of g.
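For a general cost function, equations (4) and (6) can be solved numerically. The sketch below is one possible approach, not a procedure given in the paper: writing $d = h - M$, it finds the root of $g_1(0.6745\sigma - d) - g_2(-0.6745\sigma - d)$ by bisection over $-0.6745\sigma < d < 0.6745\sigma$. Restricting $d$ to this interval keeps the argument of $g_1$ positive and that of $g_2$ negative, which is here assumed to be what rules out the inappropriate solutions. The function name and the test cost functions are hypothetical.

```python
from statistics import NormalDist

Q75 = NormalDist().inv_cdf(0.75)  # upper quartile of N(0, 1), about 0.6745


def median_adjustment(g1, g2, sigma, tol=1e-12):
    """Return (d, u*) where h = M + d minimizes the median predictive cost.

    g1 is the cost for positive errors (increasing, g1(0) = 0) and g2 the
    cost for negative errors (decreasing, g2(0) = 0).
    """
    q = Q75 * sigma

    def gap(d):
        # Positive at d = -q, negative at d = q, and decreasing in between.
        return g1(q - d) - g2(-q - d)

    lo, hi = -q, q
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    d = 0.5 * (lo + hi)
    return d, g1(q - d)


# Illustrative cost functions (assumed values).
d_lin, u_lin = median_adjustment(lambda x: 2.0 * x, lambda x: -x, sigma=1.0)
d_quad, u_quad = median_adjustment(lambda x: 4.0 * x * x, lambda x: x * x, sigma=2.0)
```

Because the gap function is strictly decreasing on the bracket, the root is unique there, so the bisection cannot land on a spurious solution outside the admissible range.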

COMPARISONS USING AN ASYMMETRIC LINEAR COST FUNCTION

Consider the asymmetric linear cost function

$$g(x) = \begin{cases} ax, & x > 0 \\ bx, & x \le 0, \end{cases}$$

where $a > 0$, $b < 0$. In this case, $g_1(x) = ax$, $g_2(x) = bx$ and $l_1(u) = u/a$, $l_2(u) = u/b$ for $u > 0$. From constraint (4) we obtain

$$u^* = \frac{2(M - h)ab}{a + b}$$

and substituting in (6), the required $h$ is seen to be

$$h = M + 0.6745\sigma\,\frac{(a + b)}{(a - b)}, \tag{7}$$

merely an adjustment to the predictive mean, M. Clearly h > M if a + b > 0 and h < M if a + b < 0. Condition (5) reduces to

$$\frac{d^2u^*}{dh^2} = \frac{u^*}{\sigma^2} = \frac{2(M - h)ab}{\sigma^2(a + b)} > 0$$

and, since this is satisfied, (7) gives the optimal solution. The expected predictive cost of solution (7) is

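$$\sigma\left\{(a - b)\,\phi\!\left(\frac{0.6745(a + b)}{(a - b)}\right) + \frac{0.6745(a + b)}{(a - b)}\left[-a + (a - b)\,\Phi\!\left(\frac{0.6745(a + b)}{(a - b)}\right)\right]\right\},$$

where $\phi$ is the probability density function of the standard normal distribution. For comparison, the minimal expected predictive cost attainable by any predictor $h$ is

$$(a - b)\,\sigma\,\phi(h^*) + h^*\sigma\,[-a + (a - b)\,\Phi(h^*)],$$

where $h^*\sigma$ is the adjustment to $M$ obtained by Granger,¹ satisfying $\Phi(h^*) = a/(a - b)$. The minimal expected predictive cost is thus

$$(a - b)\,\sigma\,\phi\!\left\{\Phi^{-1}\!\left(\frac{a}{a - b}\right)\right\}$$

and a measure of efficiency, in terms of expected predictive cost, of solution (7) is the ratio

$$r(e) = \frac{\phi\!\left(\dfrac{0.6745(e - 1)}{(e + 1)}\right) + \dfrac{0.6745(e - 1)}{(e + 1)}\left[\dfrac{-e}{(e + 1)} + \Phi\!\left(\dfrac{0.6745(e - 1)}{(e + 1)}\right)\right]}{\phi\!\left\{\Phi^{-1}\!\left(\dfrac{e}{e + 1}\right)\right\}},$$

where $e = a/(-b) > 0$. Note that

$$r(1/e) = r(e)$$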

for all e > 0. Table 1(a) gives values of efficiency, r(e), for a range of values of e = a/( -b) > 0.
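The ratio $r(e)$ can be evaluated directly from the expected-cost expression. The following sketch assumes $b = -1$ and $\sigma = 1$ (the ratio does not depend on either choice) and uses hypothetical function names; it is meant only as a check on the formulas above.

```python
from statistics import NormalDist

N01 = NormalDist()
Q75 = N01.inv_cdf(0.75)  # about 0.6745


def expected_linear_cost(c, a, b, sigma=1.0):
    """Expected cost of h = M + c*sigma when X ~ N(M, sigma^2) and
    g(x) = a*x for x > 0, b*x for x <= 0 (a > 0, b < 0)."""
    return sigma * ((a - b) * N01.pdf(c) + c * (-a + (a - b) * N01.cdf(c)))


def efficiency(e):
    """r(e): expected cost of the median solution (7) over the minimal expected cost."""
    a, b = e, -1.0                        # any pair with a/(-b) = e gives the same ratio
    c_median = Q75 * (a + b) / (a - b)    # adjustment from (7)
    c_mean = N01.inv_cdf(a / (a - b))     # Granger's adjustment h*
    return expected_linear_cost(c_median, a, b) / expected_linear_cost(c_mean, a, b)


for e in (1.0, 1.5, 2.0, 3.0, 4.0, 5.0):
    print(e, round(efficiency(e), 3))
```

Evaluating `efficiency(0.5)` and `efficiency(2.0)` gives the same value, in line with the symmetry $r(1/e) = r(e)$.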





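The two corresponding adjustments to $M$ to minimize the median and mean predictive costs, respectively, are as given in Table 1(b), where

$$\delta_1 = \frac{0.6745(a + b)}{(a - b)} = \frac{0.6745(e - 1)}{(e + 1)}, \qquad \delta_2 = \Phi^{-1}\!\left(\frac{a}{a - b}\right) = \Phi^{-1}\!\left(\frac{e}{e + 1}\right),$$

and the adjustments to $M$ are $\delta_1\sigma$, $\delta_2\sigma$. Note that

$$\delta_1(1/e) = -\delta_1(e) \quad \text{and} \quad \delta_2(1/e) = -\delta_2(e).$$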

TABLE 1(a)

e       1       1.5     2       3       4       5
r(e)    1       1.007   1.022   1.061   1.106   1.156

e       6       7       10      100     1000    ∞
r(e)    1.207   1.259   1.421   6.072   44.597  ∞

TABLE 1(b)

e       1       1.5     2       3       4       5
δ_1     0       0.135   0.225   0.337   0.405   0.450
δ_2     0       0.253   0.431   0.674   0.842   0.967

e       6       7       10      100     1000    ∞
δ_1     0.482   0.506   0.552   0.661   0.673   0.6745
δ_2     1.068   1.150   1.335   2.334   3.090   ∞

Observe that when $a$ and $-b$ are of similar magnitude, say

$$0.1 < a/(-b) < 10,$$

$$\delta_2 = 2\delta_1,$$

approximately. Within this range, the relative efficiency $r$ is close to 1, indicating that the expected predictive costs of the two solutions are not dissimilar.
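The two adjustments tabulated in Table 1(b) can be recomputed in a few lines. The sketch below uses hypothetical function names and takes the median adjustment to be $0.6745(e - 1)/(e + 1)$ and the mean adjustment to be $\Phi^{-1}(e/(e + 1))$, both in units of $\sigma$; it also prints their ratio.

```python
from statistics import NormalDist

N01 = NormalDist()
Q75 = N01.inv_cdf(0.75)  # about 0.6745


def delta_median(e):
    """Adjustment (in units of sigma) minimizing the median predictive cost."""
    return Q75 * (e - 1.0) / (e + 1.0)


def delta_mean(e):
    """Adjustment (in units of sigma) minimizing the expected predictive cost."""
    return N01.inv_cdf(e / (e + 1.0))


for e in (1.5, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 10.0):
    dm, dn = delta_median(e), delta_mean(e)
    print(f"e={e:5.1f}  median={dm:.3f}  mean={dn:.3f}  ratio={dn / dm:.2f}")
```

At $e = 3$ the ratio is exactly 2, because $e/(e + 1) = 0.75$ and the mean adjustment is then the upper quartile itself.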

PREDICTION WITH AN ASYMMETRIC QUADRATIC COST FUNCTION

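Consider the asymmetric quadratic cost function

$$g(x) = \begin{cases} ax^2, & x > 0 \\ bx^2, & x \le 0, \end{cases}$$

where $a > 0$, $b > 0$. In this case,

$g_1(x) = ax^2$, $g_2(x) = bx^2$ and $l_1(u) = (u/a)^{1/2}$, $l_2(u) = -(u/b)^{1/2}$ for $u > 0$.

Constraint (4) gives

$$u^* = \left[\frac{2(M - h)}{\dfrac{1}{a^{1/2}} - \dfrac{1}{b^{1/2}}}\right]^2$$

and the solution of (6) is

$$h = M + 0.6745\sigma\,\frac{(a^{1/2} - b^{1/2})}{(a^{1/2} + b^{1/2})}, \tag{8}$$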


again an adjustment to the predictive mean, M. Clearly h > M if a > b and h < M if a < b. The second-order condition (5) becomes

$$\frac{d^2u^*}{dh^2} = \frac{2u^*}{\sigma^2} > 0$$

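and, since this is satisfied, (8) gives the optimal solution.

The problem of minimizing the mean predictive cost is not so analytically tractable. The optimal $h$ with this criterion, as stated by Granger,¹ is a solution of

$$a\int_0^{\infty} x\,dF_k(x + h) + b\int_{-\infty}^{0} x\,dF_k(x + h) = 0. \tag{9}$$

This reduces to

$$(a - b)\,\sigma\,\phi\!\left(\frac{h - M}{\sigma}\right) + (M - h)\left[a - (a - b)\,\Phi\!\left(\frac{h - M}{\sigma}\right)\right] = 0,$$

which gives

$$h = M + \sigma v^*,$$

where $v^*$ is a solution of

$$\phi(v) - v\left[\frac{a}{a - b} - \Phi(v)\right] = 0 \tag{10}$$

or, equivalently, of

$$-v = \frac{\phi(v)}{\Phi(v) - \dfrac{a}{a - b}} = \frac{d}{dv}\left\{\log\left|\Phi(v) - \frac{a}{a - b}\right|\right\}.$$

The corresponding (minimal) expected predictive cost is

$$\sigma^2(a - b)\,\phi(v^*)/v^*,$$

an alternative expression for which is

$$\sigma^2[a - (a - b)\,\Phi(v^*)].$$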

Observe that v* < 0 if a < b and v* > 0 if a > b. Equation (10) can be solved numerically, but clearly the median solution (8) is rather more convenient. The expected predictive cost of any adjustment h = M + ca (c constant) is

$$\sigma^2\{a(1 + c^2) - (a - b)(1 + c^2)\,\Phi(c) - (a - b)\,c\,\phi(c)\},$$

and this applies to the median solution (8) with

$$c = \frac{0.6745\,(a^{1/2} - b^{1/2})}{(a^{1/2} + b^{1/2})}.$$
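To compare the two criteria for the quadratic cost, equation (10) can be solved by bisection and the expected-cost expression evaluated at both adjustments. The sketch below is illustrative only: it works with (10) multiplied through by $(a - b)$, so that the case $a = b$ needs no special handling, and the bracket $[-10, 10]$, the function names and the values $a = 4$, $b = 1$ are assumptions.

```python
from math import sqrt
from statistics import NormalDist

N01 = NormalDist()
Q75 = N01.inv_cdf(0.75)  # about 0.6745


def expected_quadratic_cost(c, a, b, sigma=1.0):
    """Expected cost of h = M + c*sigma for g(x) = a*x^2 (x > 0), b*x^2 (x <= 0)."""
    return sigma ** 2 * (a * (1 + c * c)
                         - (a - b) * (1 + c * c) * N01.cdf(c)
                         - (a - b) * c * N01.pdf(c))


def mean_adjustment(a, b, tol=1e-12):
    """Solve (a - b)*phi(v) - v*[a - (a - b)*Phi(v)] = 0 for v* by bisection."""
    def G(v):
        return (a - b) * N01.pdf(v) - v * (a - (a - b) * N01.cdf(v))

    lo, hi = -10.0, 10.0  # G is decreasing, with G(lo) > 0 > G(hi) for moderate a, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if G(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def median_adjustment_quadratic(a, b):
    """The constant c of the median solution (8)."""
    return Q75 * (sqrt(a) - sqrt(b)) / (sqrt(a) + sqrt(b))


a, b = 4.0, 1.0  # illustrative values
v_star = mean_adjustment(a, b)
c_med = median_adjustment_quadratic(a, b)
cost_mean = expected_quadratic_cost(v_star, a, b)
cost_median = expected_quadratic_cost(c_med, a, b)
print(v_star, c_med, cost_median / cost_mean)
```

The printed ratio plays the same role for the quadratic cost as $r(e)$ does for the linear cost: it measures how much expected cost is given up by using the closed-form median solution.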

CONCLUDING REMARKS

If a more complicated cost function obtains, then it appears that (6) and (4) are more amenable to solution than is the extension of (9):

$$\int_h^{\infty} g_1'(x - h)\,dF_k(x) + \int_{-\infty}^{h} g_2'(x - h)\,dF_k(x) = 0;$$

compare (7) with (8) and $\Phi^{-1}(a/(a - b))$ with solving (10). For example, with a simple cubic or quartic cost function, there is an obvious simple extension of (8), but the corresponding extension of equation (10) is more complicated. Thus, if it is required to consider expected predictive cost, it





might be more convenient to find the solution minimizing the median predictive cost as a practical compromise, and adjust if necessary (cf. Table 1).

Moreover, the solution based on the median criterion can be less biased than that based on the mean and also less risk-averse; under certain circumstances, these attributes might be more desirable. The median approach, of course, gives less weight to extreme values, and could be appropriate if the cost function is truncated or bounded, as in practice it inevitably must be.

REFERENCE

1. C. W. J. GRANGER (1969) Prediction with a generalized cost of error function. Opl Res. Q. 20, 199-207.
