Lecture 9

Post on 19-Dec-2015


Page 1: Lecture 9. Unconstrained Optimization Need to maximize a function f(x), where x is a scalar or a vector x = (x 1, x 2 ) f(x) = -x 1 2 - x 2 2 f(x) = -(x-a)

Lecture 9


Unconstrained Optimization

Need to maximize a function f(x), where x is a scalar or a vector

$x = (x_1, x_2)$, $f(x) = -x_1^2 - x_2^2$

$f(x) = -(x-a)^2$

$f(x) = x$


Gradient Search Techniques

Suppose $f(x)$ is differentiable, i.e., the gradient $\nabla f(x)$ exists.

Then the sequence of updates $x_{k+1} = x_k + \gamma \nabla f(x_k)$ reaches a local maximum starting from any initial value $x_0$, for all sufficiently small values of the step size $\gamma > 0$.

A function $f(x)$ has a local maximum at $x_k$ if $f(x_k)$ is greater than or equal to $f(y)$ for all $y$ in some small neighborhood of $x_k$.
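The gradient update can be sketched in a few lines. This is only an illustration, assuming the one-dimensional example $f(x) = -(x-a)^2$ from the earlier slide; the value of `a`, the step size, the iteration count, and the function names are all hypothetical choices, not taken from the lecture.

```python
def gradient_ascent(grad, x0, step=0.1, iters=200):
    """Iterate x_{k+1} = x_k + step * grad(x_k) and return the final iterate."""
    x = x0
    for _ in range(iters):
        x = x + step * grad(x)
    return x


a = 3.0  # hypothetical value of the constant a in f(x) = -(x - a)^2


def grad_f(x):
    # Derivative of f(x) = -(x - a)^2 with respect to x.
    return -2.0 * (x - a)


x_star = gradient_ascent(grad_f, x0=-5.0)
print(x_star)  # should be very close to a, the unique maximizer
```

With this step size the distance to `a` shrinks by a constant factor at every iteration; a step size that is too large would make the iterates overshoot and diverge, which is why the update is only claimed to work for sufficiently small steps.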


Concave Function

Let C be a convex subset of Rn

A function $f: C \to \mathbb{R}$ is called concave if $f(\alpha x + (1-\alpha) y) \ge \alpha f(x) + (1-\alpha) f(y)$ for all $\alpha \in [0, 1]$ and $x, y \in C$

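[Figure: plot of a concave function $f(x)$ against $x$. At the point $\alpha x + (1-\alpha) y$ between $x$ and $y$, the function value $f(\alpha x + (1-\alpha) y)$ lies above the chord value $\alpha f(x) + (1-\alpha) f(y)$.]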


Strict Concave Function

Let C be a convex subset of Rn

A function $f: C \to \mathbb{R}$ is called strictly concave if $f(\alpha x + (1-\alpha) y) > \alpha f(x) + (1-\alpha) f(y)$ for all $\alpha \in (0, 1)$ and all $x, y \in C$ with $x \ne y$

A straight line is concave, but not strictly concave.

Any local maximum of a concave function is a global maximum.

A strictly concave function has at most one global maximizer: if a maximum exists, it is unique.


Constrained Optimization

Maximize a function f(x), where x must be in a given set S

Suppose $f(x)$ is linear in $x$.

Set S is specified by linear inequalities.

Then this is a linear program


Network Flow

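Maximize the total output flow $d$ from the source $s$ to the destination $t$, subject to:

$\sum_{v:(v,s) \in E} x_{vs} - \sum_{v:(s,v) \in E} x_{sv} = -d$

$\sum_{v:(v,t) \in E} x_{vt} - \sum_{v:(t,v) \in E} x_{tv} = d$

$\sum_{v:(v,u) \in E} x_{vu} = \sum_{v:(u,v) \in E} x_{uv}$ for every node $u$ other than $s$ and $t$

$0 \le x_{uv} \le C_{uv}$ for every link $(u, v)$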


Network Flow as a Linear Program

Here, x is a flow allocation.

f(x) is the net flow out of the source under flow allocation x

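$f(x) = \sum_{v:(s,v) \in E} x_{sv} - \sum_{v:(v,s) \in E} x_{vs}$

Set S is the set of feasible flows. Feasibility conditions are

$0 \le x_{uv} \le C_{uv}$ for every link $(u, v)$

$\sum_{v:(v,u) \in E} x_{vu} = \sum_{v:(u,v) \in E} x_{uv}$ for every node $u$ other than $s$ and $t$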
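To make the objective and the feasibility conditions concrete, here is a small sketch that evaluates them for a given flow allocation. The four-node network, its capacities, and the function names are hypothetical; the sketch only checks a candidate flow and does not solve the linear program.

```python
def net_flow_out(flow, node):
    """Flow leaving `node` minus flow entering it; `flow` maps (u, v) -> x_uv."""
    out_flow = sum(x for (u, v), x in flow.items() if u == node)
    in_flow = sum(x for (u, v), x in flow.items() if v == node)
    return out_flow - in_flow


def is_feasible(flow, capacity, s, t, tol=1e-9):
    """Check 0 <= x_uv <= C_uv on every link, and conservation at every node except s and t."""
    for link, cap in capacity.items():
        x = flow.get(link, 0.0)
        if x < -tol or x > cap + tol:
            return False
    nodes = {n for link in capacity for n in link}
    return all(abs(net_flow_out(flow, n)) <= tol for n in nodes - {s, t})


# Hypothetical network with two paths: s -> a -> t and s -> b -> t.
capacity = {("s", "a"): 2.0, ("a", "t"): 1.0, ("s", "b"): 1.0, ("b", "t"): 3.0}
flow = {("s", "a"): 1.0, ("a", "t"): 1.0, ("s", "b"): 1.0, ("b", "t"): 1.0}

print(is_feasible(flow, capacity, "s", "t"))
print(net_flow_out(flow, "s"))  # the objective f(x) for this allocation
```

Both the objective and every constraint are linear in the entries of `flow`, which is what makes the maximum flow problem a linear program.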


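Linear Program Techniques

Simplex

Exponential complexity in the worst case

However, linear programs can be solved in polynomial time (for example, by the ellipsoid method or by interior-point methods).

If a problem can be modeled as an LP of polynomial size, then the problem can be solved in polynomial time, and hence is not NP-hard unless P = NP.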


Convex Optimization

Maximize a function f(x), where x must be in a given set S

Function f(x) is concave

Set S is convex

Get rid of the constraints!

Consider a function g(x) such that g(x) is small if x is not in S and large otherwise.

Now try to maximize f(x) + g(x) without any constraints

Now f(x) + g(x) can be maximized by gradient search techniques if f(x) + g(x) is differentiable.


Let us be more specific about set S.

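A vector $x$ belongs to set S if $g_i(x) \le 0$, $i = 1, \ldots, M$

We would like to maximize $f(x) - \sum_i \lambda_i g_i(x)$

What should be the value of $\lambda$?

$q(\lambda) = \max_x \left( f(x) - \sum_i \lambda_i g_i(x) \right)$

$\min_{\lambda \ge 0} q(\lambda)$

The output of the above minimization is the maximum value of $f(x)$ subject to $x \in S$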
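A small worked instance may help here. It is an illustration only, assuming a single constraint written as $g(x) \le 0$ with multiplier $\lambda \ge 0$; the particular choice $f(x) = -(x-2)^2$ and $g(x) = x - 1$ is hypothetical and not from the lecture.

```latex
\begin{align*}
\text{Primal:}\quad & \max_x\; f(x) = -(x-2)^2 \quad \text{subject to } g(x) = x - 1 \le 0 \\
q(\lambda) &= \max_x \left( -(x-2)^2 - \lambda (x-1) \right) \\
& -2(x-2) - \lambda = 0 \;\Rightarrow\; x(\lambda) = 2 - \tfrac{\lambda}{2} \\
q(\lambda) &= -\tfrac{\lambda^2}{4} - \lambda \left( 1 - \tfrac{\lambda}{2} \right) = \tfrac{\lambda^2}{4} - \lambda \\
\min_{\lambda \ge 0} q(\lambda): &\quad \tfrac{\lambda}{2} - 1 = 0 \;\Rightarrow\; \lambda^* = 2, \quad q(\lambda^*) = -1 \\
\text{Check:}\quad & x(\lambda^*) = 1, \quad f(1) = -1 = q(\lambda^*)
\end{align*}
```

The unconstrained maximizer $x = 2$ violates the constraint, and the minimizing multiplier is exactly the penalty that pushes the maximizer back to the boundary $x = 1$, where the constrained maximum is attained.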


Primal-Dual Approach

Primal

Maximize $f(x)$ subject to $g_i(x) \le 0$, $i = 1, \ldots, M$

Dual

$q(\lambda) = \max_x \left( f(x) - \sum_i \lambda_i g_i(x) \right)$

$\min_{\lambda \ge 0} q(\lambda)$


Advantages of Dual Approach

If the dual is differentiable, then it can be solved by gradient search techniques.

For the convex programming we are considering, the primal and dual optimal values are equal

(this may not hold if the feasible set is nonconvex).

The dual is always convex, so any local minimum of the dual is a global minimum.

The dual is differentiable if the objective function f(x) is strictly concave.


Flow Control

Sessions are normally greedy.

Would like to send as much traffic as possible

However, this would congest the network.

Need to regulate the flow of the sessions.

Flow of a session must depend on the bandwidth requirements and the revenues paid by the sessions.


Every session has a utility function.

Utility is a function of the bandwidth. It reflects the value attached to the bandwidth by a session.

The objective of the network is to allocate bandwidths to maximize the sum of the utilities of the sessions.

The underlying assumption is that the network charges the users in accordance with the declared utility functions.


Utility Maximization

Utility of session i is $U_i(x)$

Let there be N sessions.

Every session has a predetermined path.

Maximize $\sum_i U_i(r_i)$ subject to

$\sum_{i:\ \text{session } i \text{ traverses link } l} r_i \le C_l$ for every link $l$


This has been studied by several researchers:

F. Kelly, "Charging and Rate Control for Elastic Traffic," European Transactions on Telecommunications, vol. 8, no. 1, 1997, pp. 33-37

F. Kelly, A. Maulloo, D. Tan, "Rate Control for Communication Networks: Shadow Prices, Proportional Fairness and Stability," Journal of the Operational Research Society, vol. 49, no. 3, 1998, pp. 237-252

S. Low and D. Lapsley, "Optimization Flow Control I: Basic Algorithm and Convergence," IEEE/ACM Transactions on Networking, vol. 7, no. 6, Dec. 1999


R. La, V. Anantharam, "Charge Sensitive TCP and Rate Control in the Internet," Proceedings of INFOCOM 2000, March 2000

S. Kunniyur, R. Srikant, "End-to-End Congestion Control Schemes: Utility Functions, Random Losses and ECN Marks," Proceedings of INFOCOM 2000, March 2000

K. Kar, S. Sarkar, L. Tassiulas, "A Simple Rate Control Algorithm for Maximizing Total User Utility," Proceedings of INFOCOM 2001, Alaska


Dual based Approach

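Primal

Maximize $\sum_i U_i(r_i)$ subject to

$\sum_{i:\ \text{session } i \text{ traverses link } l} r_i \le C_l$ for every link $l$

Dual

$L(r, p) = \sum_i U_i(r_i) - \sum_l p_l \left( \sum_{i:\ \text{session } i \text{ traverses link } l} r_i - C_l \right)$

$= \sum_i \left( U_i(r_i) - r_i \sum_{l:\ l \text{ is on session } i \text{ path}} p_l \right) + \sum_l p_l C_l$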


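$D(p) = \max_r L(r, p)$

$= \max_r \sum_i \left( U_i(r_i) - r_i \sum_{l:\ l \text{ is on session } i \text{ path}} p_l \right) + \sum_l p_l C_l$

$= \sum_i \left( U_i(r_i(p)) - r_i(p) \sum_{l:\ l \text{ is on session } i \text{ path}} p_l \right) + \sum_l p_l C_l$

where the maximizing rate $r_i(p)$ satisfies $U_i'(r_i(p)) = \sum_{l:\ l \text{ is on session } i \text{ path}} p_l$

The optimum is attained at $\min_{p \ge 0} D(p)$

Assumption: Utility functions are strictly concave.

Then the objective function $\sum_i U_i(r_i)$ is strictly concave. It follows that the dual $D(p)$ is differentiable.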


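Hence, we can use gradient search to attain $\min_{p \ge 0} D(p)$

$p^{k+1} = \left( p^k - \gamma \nabla D(p^k) \right)^+$, where the step size $\gamma$ is sufficiently small and $(z)^+ = \max(z, 0)$ is applied to each component

$D(p) = \sum_i \left( U_i(r_i(p)) - r_i(p) \sum_{l:\ l \text{ is on session } i \text{ path}} p_l \right) + \sum_l p_l C_l$

$\partial D(p) / \partial p_l = \sum_i \frac{\partial r_i(p)}{\partial p_l} \left( U_i'(r_i(p)) - \sum_{l':\ l' \text{ is on session } i \text{ path}} p_{l'} \right) + C_l - \sum_{i:\ \text{session } i \text{ traverses link } l} r_i(p)$

The first sum vanishes because $U_i'(r_i(p))$ equals the path price of session $i$, so

$\partial D(p) / \partial p_l = C_l - \sum_{i:\ \text{session } i \text{ traverses link } l} r_i(p)$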


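Hence, $p_l^{k+1} = \left( p_l^k - \gamma \left( C_l - \sum_{i:\ \text{session } i \text{ traverses link } l} r_i(p^k) \right) \right)^+$

Also, $U_i'(r_i(p)) = \sum_{l:\ l \text{ is on session } i \text{ path}} p_l$

That is, $r_i(p) = U_i'^{-1}\left( \sum_{l:\ l \text{ is on session } i \text{ path}} p_l \right)$

Initially, choose $p_l^0 = 0$ for all links $l$.

Update session rates as $r_i(p^0) = U_i'^{-1}(0)$, for all sessions $i$.

Update link prices as $p_l^1 = \left( -\gamma \left( C_l - \sum_{i:\ \text{session } i \text{ traverses link } l} r_i(p^0) \right) \right)^+$

Then update the session rates again, then the link prices, and so on.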


Finally, the dual minimum is attained, and the rates converge to those which attain the maximum utilities.
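The alternating rate and price updates can be sketched as a short simulation. This is a minimal sketch under stated assumptions, not the lecture's implementation: it assumes logarithmic utilities $U_i(r) = \log r$, so that $U_i'^{-1}(q) = 1/q$, and it caps rates at a hypothetical `r_max` because that inverse is unbounded at price 0. The topology, capacities, step size, and iteration count are likewise hypothetical.

```python
def dual_rate_control(paths, capacity, step=0.01, iters=20000, r_max=10.0):
    """Alternate session-rate and link-price updates; return (rates, prices).

    paths[i] lists the links used by session i; capacity[l] is C_l.
    """
    prices = [0.0] * len(capacity)  # p_l^0 = 0 for all links
    rates = [r_max] * len(paths)
    for _ in range(iters):
        # Session update: r_i = U_i'^{-1}(path price). For U_i(r) = log r this
        # is 1 / path price, capped at r_max (an assumption of this sketch).
        for i, path in enumerate(paths):
            path_price = sum(prices[l] for l in path)
            rates[i] = r_max if path_price <= 1.0 / r_max else 1.0 / path_price
        # Link update: p_l <- (p_l - step * (C_l - load_l))^+
        for l, cap in enumerate(capacity):
            load = sum(rates[i] for i, path in enumerate(paths) if l in path)
            prices[l] = max(0.0, prices[l] - step * (cap - load))
    return rates, prices


# Hypothetical network: session 0 uses links 0 and 1, session 1 uses link 0,
# session 2 uses link 1.
paths = [[0, 1], [0], [1]]
capacity = [1.0, 2.0]
rates, prices = dual_rate_control(paths, capacity)
print(rates)
print(prices)
```

In this example both links end up fully used, and the session that crosses two links receives the smallest rate because it pays the sum of two link prices.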


Intuitive Explanation

pl is the link price.

Link price update:

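$p_l^{k+1} = \left( p_l^k - \gamma \left( C_l - \sum_{i:\ \text{session } i \text{ traverses link } l} r_i(p^k) \right) \right)^+$

New link price = (Old link price - $\gamma$ × (Link capacity - Sum of rates of sessions traversing the link))$^+$, where $(z)^+ = \max(z, 0)$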

Link price increases if there is congestion, and decreases if the link bandwidth is underutilized.


Session rate update:

Session rate = $U_i'^{-1}$(path price for session i)

Since utility functions are strictly concave, derivatives of utility functions are decreasing.

It follows that the rate of a session increases if the path price for the session decreases, and decreases if the path price increases.

Path price increases if there is congestion in the path, and decreases if there is underutilization.

So session rate increases if resources are under-utilized in its path, and decreases if there is congestion in its path


Distributed Implementation

Link price update requires only the sum of the session rates in the link.

Session rate update requires only the sum of the link prices in its path.

So, a session source sends a control packet whose price field is initialized to 0.

Every link on the path increments this value by its link price.

By the time the control packet reaches the receiver, the price value in the packet equals the path price for the session.


Session rate is updated based on the price in the control packet.

Receiver communicates the rate to the session source.

Links can learn the session rate from the control packet traveling from the receiver back towards the source, and subsequently use these rates to update the link price.

Alternatively, the source sends at the rate directed by the receiver.

Links measure the session rates from the number of incoming packets.

Scheduling may play a role in the convergence for the latter!


Random Early Marking

A heuristic implementation of the previous optimization algorithm with one-bit marking.

Athuraliya, Low and Lapsley, GLOBECOM 1999

In the current Internet, there is a proposal to include one bit in the header of every packet for congestion notification (ECN: explicit congestion notification)

If a router is congested, then it marks the bit for every packet traversing the router.


When the marked packet reaches the receiver, it knows that the path is congested and asks the source to reduce the transmission rate.

REM suggests that a link $l$ mark a packet of any session with probability $1 - \exp(-p_l)$, where $p_l$ is the link price.

Probability that a packet is marked is $1 - \prod_l \left( 1 - (1 - \exp(-p_l)) \right)$, where the product is over the links on the packet's path

(assuming that packet markings in different links are independent events)

Probability that a packet is marked for a session $i$ is $1 - \exp\left( -\sum_{l:\ l \text{ is on session } i \text{ path}} p_l \right)$


This probability can be estimated from the fraction of marked packets reaching the receiver.

Clearly, path price for a session can be estimated from this probability.

Path price of a session = -ln(1 – fraction of marked packets)
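The inversion from marking fraction to path price can be checked numerically. The sketch below is illustrative only: the link prices, the number of packets, and the function names are hypothetical, and markings at different links are simulated as independent events, as assumed above.

```python
import math
import random


def mark_probability(link_prices):
    """Probability that at least one link on the path marks the packet."""
    unmarked = 1.0
    for p in link_prices:
        unmarked *= math.exp(-p)  # link leaves the packet unmarked w.p. exp(-p_l)
    return 1.0 - unmarked


def estimate_path_price(marked, total):
    """Invert the marking probability: path price = -ln(1 - marked fraction)."""
    return -math.log(1.0 - marked / total)


link_prices = [0.2, 0.5, 0.1]  # hypothetical prices of the links on one path
rng = random.Random(0)         # fixed seed so the run is repeatable
total = 100_000
marked = sum(
    any(rng.random() < 1.0 - math.exp(-p) for p in link_prices)
    for _ in range(total)
)

print(sum(link_prices))                    # true path price
print(estimate_path_price(marked, total))  # receiver's estimate
```

The estimate is noisy because it is based on a finite sample of packets, which is one source of the estimation error that REM has to tolerate.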

Links estimate session rates from the packet arrival rates.

Links compute the link prices from these session rates

Use the link prices to probabilistically mark packets


Receiver estimates the path price from fraction of marked packets

Update session rates on the basis of the estimated path price

Communicate session rates to the source.

Source sends packets accordingly.

Estimation error possible.

However, simulation results indicate that the actual rates oscillate in a neighborhood of the optimum rates.

No convergence proof.