engineering serendipity: successes in solitary foraging ... · the biomimicry institute david...

Engineering Serendipity Successes and New Investigations

Engineering Serendipity: Successes in SolitaryForaging and New Investigations in Cooperative

Task-Processing Networks

Theodore (Ted) P. Pavlic – The Ohio State University

Department of Electrical & Computer Engineering

Thursday, May 6, 2010

Overview

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Solitary foraging: fromecology to engineeringand back

Cooperative taskprocessing

Closing remarks

IntroductionReflections

Faulty connections

Missed connections

Motivations

Solitary foraging: from ecology to engineering and backSpeed choice (→)

Impulsiveness and operant conditioning (←)

Long patch residence times (←)

Cooperative task processingBackground

Task-processing network

Cooperation game

Asynchronous convergence to cooperation

Results

Closing remarks

Manufacturing Serendipity

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

Craig Tovey: “manufacturing serendipity”

1Compliments to XKCD: http://xkcd.com/720/.

Engineering Manufacturing Serendipity

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

Craig Tovey: success with bees and Internet server allocation

1Compliments to XKCD: http://xkcd.com/720/.

Faulty connections

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

The Biomimicry Institute

David Brazier

Faulty connections

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

David Brazier

Faulty connections

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

David Brazier(R. Wehner)

Missed Abstract Connections

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

(Important?) Missed Abstract Connections

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

IFD (Fretwell 1972; Fretwell and Lucas 1969) ⇐⇒ Optimal power

dispatch (Bergen and Vittal 2000)

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

Economic dispatch problem:

minimize

Ci(Pi) subject to

Pi = P

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

minimize

Ci(Pi) subject to

Pi = P

Pareto minimization of costs subject to conservation simplex.

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

minimize

Ci(Pi) subject to

Pi = P

Solution (from KKT) is an “upside-down” IFD:

dCi(Pi)

dPi= λ ∀i ∈ 1, 2, . . . , n (and truncate appropriately)

Equalization of marginal cost matches IFD equalization of suitability.

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

minimize

Ci(Pi) subject to

Pi = P

Solution (from KKT) is an “upside-down” IFD:

dCi(Pi)

dPi= λ ∀i ∈ 1, 2, . . . , n (and truncate appropriately)

Equalization of marginal cost matches IFD equalization of suitability.

[ Conical cost combination with simplex constraint set has simple

solution in dual space (i.e., solve for λ). ]

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

IFD as optimization problem (thought experiment):

maximize

∫ xi

0si(y) dy subject to

xi = N

Pareto maximization of (???) subject to conservation simplex.

Right-side-up IFD:

si(xi) = λ ∀i ∈ 1, 2, . . . , n (and truncate appropriately)

Distribute xi to equalize suitability.

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

maximize

Gi(xi) subject to

xi = N

Pareto maximization of gain(?) subject to conservation simplex.

Right-side-up IFD:

dGi(xi)

dxi= λ ∀i ∈ 1, 2, . . . , n (and truncate appropriately)

Distribute xi to equalize marginal gain.

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

maximize

Gi(ti) subject to

ti = T

Maximization of distributed gain subject to limited time inside patch.

Right-side-up IFD:

dGi(ti)

dti= λ ∀i ∈ 1, 2, . . . , n (and truncate appropriately)

Distribute ti to equalize marginal gain.

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

Risk-sensitive foraging (Stephens and Charnov 1982; Stephens and

Krebs 1986) ⇐⇒ Sharpe ratio/MPT (Sharpe 1966, 1994)

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

Sharpe (Nobel prize, Economics, 1990) ratio:

E(R)−Rf

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

E(R)−Rf

Exactly the Z-score ranking method of risk-sensitive foraging theory.

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

E(R)−Rf

Exactly the Z-score ranking method of risk-sensitive foraging theory.

MPT (then)→ PMPT (now) (stochastic dominance, Bawa 1982)

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

Iain Couzin’s (2000+) ⇐⇒ Bertsekas and Tsitsiklis (1997−) (e.g.,

mysterious torus shapes; symmetry; symmetry breaking)

The Tenth Muse

Introduction

Reflections

Faulty connections

Missed connections

Motivations

Closing remarks

Solitary foraging: from ecology toengineering and back

Introduction

Speed choice (→)

Impulsiveness andoperant conditioning(←)

Long patch residencetimes (←)

Closing remarks

Foraging theory for speed choice

Introduction

Speed choice (→)

Closing remarks

Nice homomorphism between solitary foragers and

autonomous vehicles (Andrews et al. 2004; Charnov

1973; Quijano et al. 2006; Stephens and Krebs 1986)

g − c

τ− 1

−cscs

Introduction

Speed choice (→)

Closing remarks

Fitness surrogate (e.g., calories, target value)

Diverse collection of targets

Opportunity cost: some should be ignored

Rate maximization for long runs

Target/task choice ⇐⇒ prey model

g − c

τ− 1

−cscs

Introduction

Speed choice (→)

Closing remarks

Vehicle speed choice is very similar to cryptic prey

problem described by Gendron and Staddon (1983)

Ceteris paribus, encounter rate increases with

search speed

Search cost increases with search speed

Detection mistakes may vary with speed

Non-trivial speed–prey choice coupling

Prey =⇒ speed =⇒ rate =⇒ prey

Bobwhite quail(Gendron andStaddon 1983)

Les Howard

Introduction

Speed choice (→)

Closing remarks

Vehicle speed choice is very similar to cryptic prey

problem described by Gendron and Staddon (1983)

Bobwhite quail(Gendron andStaddon 1983)

Les Howard

To match bobwhite quail observations, Gendron and Staddon

choose detection function P di (u) , (1− (u/umax)

Ki)1/Ki that

maps search speed u ∈ [0, umax] to detection probability P di for

tasks of type i with conspicuousness Ki ∈ [0,∞).

No analytical tractability

Chose n = 2 for simulation (1983)

P di is strange at bounds (1 and 0) u/umax

P di (u)

On-line prey–speed choice for n ∈ N

(Pavlic 2007; Pavlic and Passino 2009)

Introduction

Speed choice (→)

Closing remarks

Autonomous vehicle faces n-way merged Poisson process

λi: encounter rate for task of type i

(gi, ti): average (value, time) for processing task of type i

pi: probability that task of type i is processed (decision)

cs: cost per-unit-time of searching

Introduction

Speed choice (→)

Closing remarks

Vehicle goes through cycles of searching and processing

G: average per-encounter gain

T : average per-encounter search and processing time

G(t): Markov renewal–reward process for accumulated gain

Introduction

Speed choice (→)

Closing remarks

Long runtime =⇒ maximize rate of return

aslimt→∞

−cs +n∑

i=1λipigi

1 +n∑

i=1λipiti

, R(p)

As expected, type-II functional response (Holling’s disk equation

without any sandpaper disks).

Introduction

Speed choice (→)

Closing remarks

In general, pi ∈ [0, 1], but

∂R(p)

∂pi=

1 +n∑

j=1λjpjtj

− λiti

−cs +n∑

j=1λjpjgj

1 +n∑

i=1λipiti

Introduction

Speed choice (→)

Closing remarks

So KKT reveals optimization is purely O(2n) combinatorial

∂R(p)

∂pi=

1 +n∑

j=1j 6=i

λjpjtj

− λiti

−cs +n∑

j=1j 6=i

λjpjgj

1 +n∑

i=1λipiti

So-called zero–one rule because p∗i ∈ 0, 1

Introduction

Speed choice (→)

Closing remarks

So KKT reveals optimization is purely O(2n) combinatorial

∂R(p)

∂pi=

1 +n∑

j=1j 6=i

λjpjtj

− λiti

−cs +n∑

j=1j 6=i

λjpjgj

1 +n∑

i=1λipiti

So-called zero–one rule because p∗i ∈ 0, 1

[ Sufficient condition; not necessary ]

Introduction

Speed choice (→)

Closing remarks

Classical prey ranking refines search from O(2n) to O(n+ 1)

Processed types (p∗i = 1)︷︸︸︷g1t1

> . . . >gk∗

tk∗>

Optimal rate R(p∗)︷︸︸︷

−cs +k∗∑

i=1λigi

1 +k∗∑

i=1λiti

Ignored types (p∗i = 0)︷︸︸︷gk∗+1

tk∗+1> . . . >

where optimal p∗i = [i ≤ k∗] with k∗ ∈ 0, 1, . . . , n

Introduction

Speed choice (→)

Closing remarks

Classical prey ranking does not depend on λ (i.e., speed)

Processed types (p∗i = 1)︷︸︸︷g1t1

> . . . >gk∗

tk∗>

Optimal rate R(p∗)︷︸︸︷

−cs +k∗∑

i=1λigi

1 +k∗∑

i=1λiti

Ignored types (p∗i = 0)︷︸︸︷gk∗+1

tk∗+1> . . . >

where optimal p∗i = [i ≤ k∗] with k∗ ∈ 0, 1, . . . , n

Effects of speed

Introduction

Speed choice (→)

Closing remarks

Speed u ∈ [umin, umax] ⊂ [0,∞) influences each encounter rate

λi(u) = uDiPdi (u)

where Di is the linear density in the population

Effects of speed

Introduction

Speed choice (→)

Closing remarks

λi(u) = uDiPdi (u)

Detection function is linear interpolation of probability bounds

P di (u)

umin umax

P di (u) = P ℓ

i u+ P ai

Effects of speed

Introduction

Speed choice (→)

Closing remarks

λi(u) = uDiPdi (u)

P di (u)

umin umax

P di (u) = P ℓ

i u+ P ai

Search cost is also assumed to be affine function

cs(u) = csℓu+ csa

Effects of speed

Introduction

Speed choice (→)

Closing remarks

λi(u) = uDiPdi (u)

P di (u)

umin umax

P di (u) = P ℓ

i u+ P ai

ci(u) = cℓiu+ cai

[ Processing costs can be modeled in a similar way ]

Effects of speed

Introduction

Speed choice (→)

Closing remarks

λi(u) = uDiPdi (u)

P di (u)

umin umax

P di (u) = P ℓ

i u+ P ai

cs(u) = csℓu+ csa

[ Processing costs can be modeled in a similar way ]

[ . . . but not here. ]

Value of speed

Introduction

Speed choice (→)

Closing remarks

After regrouping, new objective function

R(p, u) =G2(p)u

2 +G1(p)u+G0(q)

T2(p)u2 + T1(p)u+ 1

where coefficients

G2(p) ,n∑

DipigiPℓi

G1(p) ,

DipiPai gi − csℓ

G0(p) , −csa

T2(p) ,

pitiDiPℓi

T1(p) ,n∑

pitiDiPai

are constant with respect to u (i.e., biquadratic ratio)

Value of speed

Introduction

Speed choice (→)

Closing remarks

After regrouping, new objective function

R(p, u) =G2(p)u

2 +G1(p)u+G0(q)

T2(p)u2 + T1(p)u+ 1

where coefficients

G2(p) ,n∑

DipigiPℓi

G1(p) ,

DipiPai gi − csℓ

G0(p) , −csa

T2(p) ,

pitiDiPℓi

T1(p) ,n∑

pitiDiPai

are constant with respect to u (i.e., biquadratic ratio)

Find optimal u∗ for each p∗ candidate (n+ 1 total)

Finding optimal speed

Introduction

Speed choice (→)

Closing remarks

Because biquadratic objective, for each p∗ candidate,

∂R(u)

(G2T1 −G1T2)u2 + 2(G2 −G0T2)u+ (G1 −G0T1)(

T2u2 + T1u+ 1

By KKT, if quadratic numerator root u∗ ∈ [umin, umax], then u∗ is

optimal speed; otherwise, optimal speed u∗ ∈ umin, umax based

on sign of numerator

Introduction

Speed choice (→)

Closing remarks

∂R(u)

(G2T1 −G1T2)u2 + 2(G2 −G0T2)u+ (G1 −G0T1)(

T2u2 + T1u+ 1

Implement O(n+ 1) algorithm on-line if Di density estimates

available (Dubin’s car AAV simulations with speed filtering, Pavlic

and Passino 2009)

Introduction

Speed choice (→)

Closing remarks

∂R(u)

(G2T1 −G1T2)u2 + 2(G2 −G0T2)u+ (G1 −G0T1)(

T2u2 + T1u+ 1

Implement O(n+ 1) algorithm on-line if Di density estimates

available (Dubin’s car AAV simulations with speed filtering, Pavlic

and Passino 2009)

Non-trivial to guarantee convergence of density estimates on-line

Estimation process =⇒ type-II type-III functional response

Estimation, impulsiveness, and the operant laboratory(Pavlic and Passino 2010c)

Introduction

Speed choice (→)

Closing remarks

Laboratory impulsiveness (Ainslie 1974; Bateson and Kacelnik

1996; Bradshaw and Szabadi 1992; Green et al. 1981; McDiarmid

and Rilling 1965; Rachlin and Green 1972; Siegel and Rachlin 1995;

Snyderman 1983; Stephens and Anderson 2001)

Introduction

Speed choice (→)

Closing remarks

Laboratory impulsiveness

Using starvation, animals are trained to use a Skinner box

Repeat mutually exclusive binary-choice trials (at low weight)

Introduction

Speed choice (→)

Closing remarks

What can be inferred about Skinner box results?

Usually assume simultaneous encounters occur with

probability zero (Poisson assumption)

Mutually exclusive choices when prey is immobile?

Patch impulsiveness vanishes (Stephens et al. 2004)

Attention (Monterosso and Ainslie 1999; Siegel and Rachlin 1995)

Introduction

Speed choice (→)

Closing remarks

Worst-case scenario for a robot

Predisposes robots to underestimate

Introduction

Speed choice (→)

Closing remarks

Worst-case scenario for an animal?

Predisposes animals to underestimate?

Introduction

Speed choice (→)

Closing remarks

Estimation of per-type densities only necessary for speed regulation

Introduction

Speed choice (→)

Closing remarks

Graphical description of optimal prey choice:

Processing time

Processinggain

For type i: or @ (processing time ti, gain g(ti))

(t1, g1)

(t2, g2)

(t3, g3)

(t4, g4)

(t5, g5)

Process

Ignore

Introduction

Speed choice (→)

Closing remarks

Processing time

Processinggain

(t1, g1)

(t2, g2)

(t3, g3)

(t4, g4)

(t5, g5)

Process

Ignore

t: Total search and processing time

Accumulatednet

Type-# encounter: # (Process) or # (Ignore)

Process encounter k when gi(k)/ti(k) > G(t(k))/t(k)

Introduction

Speed choice (→)

Closing remarks

Processing time

Processinggain

(t1, g1)

(t2, g2)

(t3, g3)

(t4, g4)

(t5, g5)

Process

Ignore

Accumulatednet

Rule (even with mistakes) is optimal facing Poisson encounters (i.e.,

simultaneous w.p.0)

Introduction

Speed choice (→)

Closing remarks

Processing time

Processinggain

(t1, g1)

(t2, g2)

(t3, g3)

(t4, g4)

(t5, g5)

Process

Ignore

Accumulatednet

simultaneous w.p.0)

Digestive rate constraints (bi: prey bulk) (Hirakawa 1995):

i=1λipibi

1 +n∑

i=1λipiti

≤ BKKT=⇒

p∗1 = 1

p∗k∗−1 = 1

p∗k∗ ∈ [0, 1]

Partial Preferences(rank by gi/bi)

Digression

Introduction

Speed choice (→)

Closing remarks

Processing time

Processinggain

(t1, g1)

(t2, g2)

(t3, g3)

(t4, g4)

(t5, g5)

Process

Ignore

Accumulatednet

simultaneous w.p.0)

Ecological–physiological hybrid method (Whelan and Brown 2005):

Asymptotic gut constraint ⇐⇒ Rank bygi

ti + tbi

Digression

Introduction

Speed choice (→)

Closing remarks

Processing time

Processinggain

(t1, g1)

(t2, g2)

(t3, g3)

(t4, g4)

(t5, g5)

Process

Ignore

Accumulatednet

simultaneous w.p.0)

ti + tbi

Process encounter k when gi(k)/(ti(k) + tbi(k)) > G(t(k))/t(k)

t: Search and non-ballast handling time

Accumulatednet

t1 + tb1

t2 + tb2

Handling and search time (no ballast time)Accumulatednet

i : Digestive profitability line for type i

t1 + tb1

t5 + tb5

Digression

Introduction

Speed choice (→)

Closing remarks

Processing time

Processinggain

(t1, g1)

(t2, g2)

(t3, g3)

(t4, g4)

(t5, g5)

Process

Ignore

Accumulatednet

simultaneous w.p.0)

ti + tbi

Accumulatednet

t1 + tb1

t2 + tb2

0 1 2 3 4 5

Types (ordered by digestive profitability)Proportion

ofencounters

accepted

PartialPreferences

Digression

Introduction

Speed choice (→)

Closing remarks

Processing time

Processinggain

(t1, g1)

(t2, g2)

(t3, g3)

(t4, g4)

(t5, g5)

Process

Ignore

Accumulatednet

simultaneous w.p.0)

ti + tbi

Accumulatednet

t1 + tb1

t2 + tb2

Handling and search time (no ballast time)Accumulatednet

i : Digestive profitability line for type i

t1 + tb1

t5 + tb5

SlidingMode

Digression

Introduction

Speed choice (→)

Closing remarks

Accumulatednet

Attention: simultaneous encounter (w.p.0) =⇒ low time first

Introduction

Speed choice (→)

Closing remarks

Accumulatednet

R∗type-2-only

Introduction

Speed choice (→)

Closing remarks

Accumulatednet

R∗depressed

Attention: simultaneous encounter (w.p.1) =⇒ either first

Bifurcation; lucky runs accumulate high initial estimate

Introduction

Speed choice (→)

Closing remarks

Accumulatednet

R∗type-2-only

Rescue optimality with early ad libitum feeding

Sunk costs and long patch residence times(Pavlic and Passino 2010b)

Introduction

Speed choice (→)

Closing remarks

Nolet et al. (2001) are unable to explain spatial differences in tundra

swan foraging

Introduction

Speed choice (→)

Closing remarks

swan foraging

In shallow water, swans feeding on tubers can “head dip”

In deep water, they must “up end,” which requires more energy

Nolet et al. find it strange that swans spend longer at the more

energetic task

Gemma Longman

Derek Tsang

Introduction

Speed choice (→)

Closing remarks

swan foraging

energetic task

Other sunk cost/Concorde effects (Arkes and Blumer 1985;

Arkes and Ayton 1999; Dawkins and Carlisle 1976; Kanodia

et al. 1989; Staw 1981)

Introduction

Speed choice (→)

Closing remarks

swan foraging

energetic task

Other sunk cost/Concorde effects (Arkes and Blumer 1985;

Arkes and Ayton 1999; Dawkins and Carlisle 1976; Kanodia

et al. 1989; Staw 1981)

Observations consistent with rate maximization when patch entry

costs are modeled

Introduction

Speed choice (→)

Closing remarks

swan foraging

costs are modeled. For n = 1,

R(t1) =g1(t1)1λ1

+ t1where a < b < c , g1(0) < 0

g1(t1)

t1−a− 1

t∗1a

Due to entry costs, searching is a less desirable task

Introduction

Speed choice (→)

Closing remarks

swan foraging

R(t1) =g1(t1)1λ1

+ t1where a < b < c , g1(0) < 0

g1(t1)

t1−a

− 1λ1

t∗1bt∗1a

Introduction

Speed choice (→)

Closing remarks

swan foraging

R(t1) =g1(t1)1λ1

+ t1where a < b < c , g1(0) < 0

g1(t1)

− 1λ1

t∗1ct∗1b

Introduction

Speed choice (→)

Closing remarks

swan foraging

R(t1) =g1(t1)1λ1

+ t1where a < b < c , g1(0) < 0

g1(t1)

− 1λ1

t∗1ct∗1b

May explain overstaying as well (Nonacs 2001)

Cooperative task processing

Introduction

Background

Task-processingnetwork

Cooperation game

Asynchronousconvergence tocooperation

Results

Closing remarks

Cooperation for distributed decentralized networks

Introduction

Background

Cooperation game

Results

Closing remarks

Cooperative control usually involves coordination of agents on

(possibly ad hoc) networks

e.g., Global utility functions to maximize

e.g., Projections onto non-separable spaces (i.e., not

Cartesian products)

Challenges to fast and cheap implementation

Introduction

Background

Cooperation game

Results

Closing remarks

Nash (i.e., competitive) equilibria are solutions to separable

variational inequality problems

Amenable to parallel solvers

Used by communication theorists on networks for congestion

control (Altman et al. 2005a,b; Buttyan and Hubaux 2003;

Shakkottai et al. 2006)

Strong connection to biological (and sociological) models of

emergent cooperation in nature

Introduction

Background

Cooperation game

Results

Closing remarks

Used in control to model unknown/unknowable

Typically used in control to model noise or enemy movements

(e.g., worst-case scenarios) or actions of humans in the system

Task conservation is a challenge to communication-like

application of Nash methods to task flow control

Introduction

Background

Cooperation game

Results

Closing remarks

Existing task-processing networks (TPN) (Cruz 1991; Perkins and

Kumar 1989) focus on robustness, not optimality:

Flexible manufacturing system, network components =⇒bounded queues/burstiness

Behaviors are static (i.e., no feedback)

Introduction

Background

Cooperation game

Results

Closing remarks

Existing task-processing networks (TPN) (Cruz 1991; Perkins and

Kumar 1989) focus on robustness, not optimality:

So here, elements merged from communication, TPN, and possible

analogous systems in nature (e.g., Cooperative breeding, Hamilton

and Taborsky 2005)

Try to design system so that Nash equilibrium has

characteristics that are globally favorable

Definition(Pavlic and Passino 2010a)

Introduction

Background

Cooperation game

Results

Closing remarks

A task-processing network is a directed graph:

A ⊂ N: Set of task-processing agents

P ⊆ (i, j) ∈ A2 : i 6= j: Directed arcs connecting distinct agents

Vi , j ∈ A : (j, i) ∈ P: Set of conveyors for each i ∈ A

Ci , j ∈ A : (i, j) ∈ P: Set of cooperators for each i ∈ A

V , j ∈ A : Cj 6= ∅: Set of all conveyors

C , i ∈ A : Vi 6= ∅: Set of all cooperators

Task flows at each agent:

Yi ⊂ N: Possibly empty set of task types that arrive at conveyor i ∈ A

λkj ∈ R>0 : Encounter rate of type-k tasks at agent j ∈ A (e.g., Poisson encounters)

πkj ∈ [0, 1]: Probability that conveyor j ∈ A advertises an incoming k-type task to its connected cooperators Cj

γi ∈ [0, 1]: Probability that cooperator i ∈ A volunteers for advertised task from one of its connected conveyors Vi(collected in γ)

TPN examples(Pavlic and Passino 2010a)

Input streams (k ∈ Yj ⊆ 1, 2, 3):

Conveyors (j ∈ V = 1, 2): V3 = 1 1 V4 = 1, 2 2 V5 = 2

Cooperators (i ∈ C = 3, 4, 5): 3 C1 = 3, 4 4 C2 = 4, 5 5

Y1 = 1, 2 Y2 = 1, 2, 31 2 1 2 3

π32γ3 γ4 γ5

Rate λkj λ1

1 λ21 λ1

2 λ22 λ3

Send request @ πkj

Accept request @ γi

Taskarrivals

Nodes andprocessingrequests

Flexible manufacturing system (FMS)

Input streams (k ∈ Yj ⊆ 1, 2, 3):

Conveyors (j ∈ V = 1, 2): V3 = 1 1 V4 = 1, 2 2 V5 = 2

Cooperators (i ∈ C = 3, 4, 5): 3 C1 = 3, 4 4 C2 = 4, 5 5

Y1 = 1, 2 Y2 = 1, 2, 31 2 1 2 3

π32γ3 γ4 γ5

Rate λkj λ1

1 λ21 λ1

2 λ22 λ3

Send request @ πkj

Accept request @ γi

Taskarrivals

Nodes andprocessingrequests

Flexible manufacturing system (FMS)

Cooperative breeders?

×××

λ22 λ3

AAV patrol scenario

⇐⇒

C = V = 1, 2, 3

Ci = Vi = 1, 2, 3 − i

Yj = j

λ22 λ3

AAV TPN

Cooperation gameMetrics of volunteering

Introduction

Background

Cooperation game

Results

Closing remarks

Need to develop an agent-based metric of performance that

catalyzes cooperation

Introduction

Background

Cooperation game

Results

Closing remarks

Following foraging example, define utility function Ui(γ) based on

rate of gain

Introduction

Background

Cooperation game

Results

Closing remarks

Following foraging example, define utility function Ui(γ) based on

rate of gain

To simplify presentation of combinatorial volunteering analysis,

introduce SOBP and SOMS.

Introduction

Background

Cooperation game

Results

Closing remarks

introduce SOBP and SOMS.

I : finite index set

Ω , γii∈I : indexed family with γi ∈ [0, 1] for each i ∈ I

For g, h ∈ N and Γ ⊆ I ,

SOBPg(Γ) ,

|Γ|∑

g + ℓ

C⊆Γ|C|=ℓ

k∈Γ−C

(1− γk)

SOMSh(Γ) ,

|Γ|∑

(−1)ℓ1

h+ ℓ

C⊆Γ|C|=ℓ

Several useful relationships between SOBP and SOMS.

Introduction

Background

Cooperation game

Results

Closing remarks

introduce SOBP and SOMS. For Γ ⊆ A,

SOBP1(i, k, ℓ − i)

= (1− γk)(1− γℓ) +1

2γk(1− γℓ) +

2γℓ(1− γk) +

3γkγℓ

(i.e., sum of binomial products)

For conveyor j ∈ V and cooperator i ∈ Cj = i, k, ℓ,SOBP1(i, k, ℓ − i) is probability that i is chosen to process

an advertised task from j ∈ Vi (given that it volunteered)

Introduction

Background

Cooperation game

Results

Closing remarks

introduce SOBP and SOMS. For Γ ⊆ A,

SOBP1(i, k, ℓ − i)

= (1− γk)(1− γℓ) +1

2γk(1− γℓ) +

2γℓ(1− γk) +

3γkγℓ

(i.e., sum of binomial products)

For conveyor j ∈ V and cooperator i ∈ Cj = i, k, ℓ,SOBP1(i, k, ℓ − i) is probability that i is chosen to process

an advertised task from j ∈ Vi (given that it volunteered)

SOMS gives curvature information about SOBP

Properties of SOMS and SOBP provide bounds for convergence

analysis (i.e., Lyapunov/non-deterministic set stability)

Cooperation gameAgent utility function – rate of gain

For i ∈ C, the rate of gain

Ui(γ) ,

Conveyor part — constant with respect to γi︷︸︸︷

1 −∏

j∈Ci

(1 − γj)

︸︷︷︸

Pr(Volunteer from Ci|Advertisement from i)

ri + γi

j∈Vi

Pr(i awarded task from j|i volunteers)︷︸︸︷

SOBP1(Cj − i)cij)

︸︷︷︸

Cooperator part

Ui(γ) ,

1 −∏

j∈Ci

(1 − γj)

︸︷︷︸

ri + γi

j∈Vi

SOBP1(Cj − i)cij)

︸︷︷︸

Cooperator part

bi ,∑

k∈Yi

bki − c

ri ,∑

k∈Yi

λki π

rki −

bki − c

are the costs and benefits of local process-

ing on i ∈ V

Ui(γ) ,

1 −∏

j∈Ci

(1 − γj)

︸︷︷︸

ri + γi

j∈Vi

SOBP1(Cj − i)cij)

︸︷︷︸

Cooperator part

cij ,∑

k∈Yj

λkj π

are the costs and benefits to i ∈ C for vol-

unteering for tasks exported from j ∈ Vi

Ui(γ) ,

1 −∏

j∈Ci

(1 − γj)

︸︷︷︸

ri + γi

j∈Vi

SOBP1(Cj − i)cij)

︸︷︷︸

Cooperator part

bi ,∑

k∈Yi

bki − c

ri ,∑

k∈Yi

λki π

rki −

bki − c

ing on i ∈ V

cij ,∑

k∈Yj

λkj π

Ui(γ) ,

1 −∏

j∈Ci

(1 − γj)

︸︷︷︸

ri −Qipi(Qi) + γi

j∈Vi

(pij(Qj) −

SOBP1(Cj − i)cij)

︸︷︷︸

Cooperator part — γi and Qj vary with γi

bi ,∑

k∈Yi

bki − c

ri ,∑

k∈Yi

λki π

rki −

bki − c

pi(Qi) ,∑

k∈Yi

λki π

ki (Qi)

ing on i ∈ V

cij ,∑

k∈Yj

λkj π

pij(Qj) ,∑

k∈Yj

λkj π

kj (Qj)

Fictitious payment functions added as stabilizing controls (Qi ,∑

j∈Ciγj )

Ui(γ) ,

1 −∏

j∈Ci

(1 − γj)

︸︷︷︸

ri −Qipi(Qi) + γi

j∈Vi

(pij(Qj) −

SOBP1(Cj − i)cij)

︸︷︷︸

Cooperator part — γi and Qj vary with γi

bi ,∑

k∈Yi

bki − c

ri ,∑

k∈Yi

λki π

rki −

bki − c

pi(Qi) ,∑

k∈Yi

λki π

ki (Qi)

ing on i ∈ V

cij ,∑

k∈Yj

λkj π

pij(Qj) ,∑

k∈Yj

λkj π

kj (Qj)

Fictitious payment functions added as stabilizing controls (Qi ,∑

j∈Ciγj )

Cournot oligopolies on a graph

Nash equilibriumExistence, uniqueness, and asynchronous convergence

Introduction

Background

Cooperation game

Results

Closing remarks

Natural choice for distributed variational inequality is local gradient

ascent

Introduction

Background

Cooperation game

Results

Closing remarks

ascent

Asynchronous system is governed by difference inclusion (not

difference equation)

Introduction

Background

Cooperation game

Results

Closing remarks

ascent

For set stability, sufficient to show synchronous system is a

contraction mapping

Also gives existence and uniqueness of Nash equilibrium

Introduction

Background

Cooperation game

Results

Closing remarks

ascent

contraction mapping

Because γ ∈ [0, 1]|C| comes from product topology of intervals,

must use block maximum norm (‖γ‖∞ , maxi∈C|γi|)

Introduction

Background

Cooperation game

Results

Closing remarks

ascent

contraction mapping

Because γ ∈ [0, 1]|C| comes from product topology of intervals,

must use block maximum norm (‖γ‖∞ , maxi∈C|γi|)

Procedure leads to constraints on payment functions and topology

Asynchronous convergence to Nash equilibriumPayment and topological constraints

Introduction

Background

Cooperation game

Results

Closing remarks

Assume that (Payment and topological constraints):

Introduction

Background

Cooperation game

Results

Closing remarks

1. For all i ∈ C and j ∈ Vi, pij is a stabilizing payment function

For k ∈ N, p′(Q) , dp(Q)/dQ < 0 for all Q ∈ [0, k]

For k ∈ N, p′′(Q) , d2p(Q)/dQ2 > 0 for all Q ∈ [0, k]

For k ∈ N, γp′′(Q) ≤ −p′(Q) for all Q ∈ [γ, k − (1− γ)]with γ ∈ [0, 1]

Introduction

Background

Cooperation game

Results

Closing remarks

pℓ(Q)

0 1 2 k

τ > 1

0 1 2 k

−Aτ

κ > 0

ε > κ+ 1

0 1 2 k

−Aκεκ

q0 > k + p− 1

0 1 2 k

−Apq0

Sample stabilizing payment (inverse demand) functions

Introduction

Background

Cooperation game

Results

Closing remarks

pℓ(Q)

0 1 2 k

τ > 1

0 1 2 k

−Aτ

κ > 0

ε > κ+ 1

0 1 2 k

−Aκεκ

q0 > k + p− 1

0 1 2 k

−Apq0

2. For all j ∈ V , |Cj| ≤ 3 (i.e., no conveyor can have more than 3

outgoing links to cooperators)

Introduction

Background

Cooperation game

Results

Closing remarks

pℓ(Q)

0 1 2 k

τ > 1

0 1 2 k

−Aτ

κ > 0

ε > κ+ 1

0 1 2 k

−Aκεκ

q0 > k + p− 1

0 1 2 k

−Apq0

2. For all j ∈ V , |Cj| ≤ 3 (i.e., no conveyor can have more than 3

outgoing links to cooperators)

3. For cooperator i ∈ C and j ∈ Vi, if j is a 3-conveyor (i.e.,

|Cj| = 3), then there must be some conveyor k ∈ Vi that is a

2-conveyor

Asynchronous convergence to Nash equilibriumOther example stable topologies

Introduction

Background

Cooperation game

Results

Closing remarks

λ1010

λ1111

λ88 λ9

Rich yet stable task-processing network.

Introduction

Background

Cooperation game

Results

Closing remarks

λ1010

λ1111

λ88 λ9

“Pills” stabilize problematic areas by focussing attention

Introduction

Background

Cooperation game

Results

Closing remarks

λ1010

λ1111

λ88 λ9

“Pills” stabilize problematic areas by focussing attention

Future research direction: Stable network motifs

Asynchronous convergence to Nash equilibriumTotally asynchronous algorithm

Introduction

Background

Cooperation game

Results

Closing remarks

Define T : [0, 1]n 7→ [0, 1]n by T (γ) , (T1(γ), T2(γ), . . . , Tn(γ)) where, foreach i ∈ C,

Ti(γ) , min1,max0, γi + σi∇iUi(γ)

(i.e., projected gradient ascent)

Introduction

Background

Cooperation game

Results

Closing remarks

(i.e., projected gradient ascent), where

≥ 2|Vi|maxk∈Vi

|p′ik(0)|

for all γ ∈ [0, 1]n.

Introduction

Background

Cooperation game

Results

Closing remarks

≥ 2|Vi|maxk∈Vi

|p′ik(0)|

for all γ ∈ [0, 1]n. If

minj∈Vi

|p′ij (|Cj |) | >

|Vi| −1

maxj∈Vi

|cij |, for all i ∈ C,

then the totally asynchronous distributed iteration (TADI) sequence γ(t)generated with mapping T and the outdated estimate sequence γi(t) for alli ∈ C each converge to the unique Nash equilibrium of the cooperation game.

Introduction

Background

Cooperation game

Results

Closing remarks

≥ 2|Vi|maxk∈Vi

|p′ik(0)|

for all γ ∈ [0, 1]n. If (∝ Hamilton’s rule on networks)

Benefit︷︸︸︷

minj∈Vi

|p′ij (|Cj |) | >

Relatedness︷︸︸︷(

|Vi| −1

) Cost︷︸︸︷

maxj∈Vi

|cij |, for all i ∈ C,

then the totally asynchronous distributed iteration (TADI) sequence γ(t)generated with mapping T and the outdated estimate sequence γi(t) for alli ∈ C each converge to the unique Nash equilibrium of the cooperation game.

Asynchronous convergence to Nash equilibriumResults: cooperation by cyclic feedback

Introduction

Background

Cooperation game

Results

Closing remarks

λ2 (encounters per second)

Optimal

erationwillingn

γ∗ ifori∈1,2,3

0 1 2 3 4

λ1 = 0.6

λ3 = 1.7

γ∗1

γ∗3

γ ∗2

λ2 < λ1 < λ3

γ∗2 > γ∗

1 > γ∗3

λ1 < λ2 < λ3

γ∗1 > γ∗

2 > γ∗3

λ1 < λ3 < λ2

γ∗1 > γ∗

3 > γ∗2

Simulation of AAV patrol scenario

Converges to predicted Nash equilibrium

Introduction

Background

Cooperation game

Results

Closing remarks

Optimal

erationwillingn

γ∗ ifori∈1,2,3

0 1 2 3 4

λ1 = 0.6

λ3 = 1.7

γ∗1

γ∗3

γ ∗2

λ2 < λ1 < λ3

γ∗2 > γ∗

1 > γ∗3

λ1 < λ2 < λ3

γ∗1 > γ∗

2 > γ∗3

λ1 < λ3 < λ2

γ∗1 > γ∗

3 > γ∗2

Increases in one encounter rate (e.g., λ2) cause equilibrium shift so

neighbors (e.g., 1 and 3) help more and agent (e.g., 2) helps less

Introduction

Background

Cooperation game

Results

Closing remarks

Optimal

erationwillingn

γ∗ ifori∈1,2,3

0 1 2 3 4

λ1 = 0.6

λ3 = 1.7

γ∗1

γ∗3

γ ∗2

λ2 < λ1 < λ3

γ∗2 > γ∗

1 > γ∗3

λ1 < λ2 < λ3

γ∗1 > γ∗

2 > γ∗3

λ1 < λ3 < λ2

γ∗1 > γ∗

3 > γ∗2

Increases in one encounter rate (e.g., λ2) cause equilibrium shift so

neighbors (e.g., 1 and 3) help more and agent (e.g., 2) helps less

Emergent cooperation due to cyclic feedback effects

Closing remarks

Introduction

Closing remarks

Both biology and engineering are full of interesting complex systems

Closing remarks

Introduction

Closing remarks

Real-time implementations in one domain are intuitive and

cognitively simple behaviors in another

Closing remarks

Introduction

Closing remarks

Homomorphisms are not always obvious and should not be

forced

Closing remarks

Introduction

Closing remarks

forced

Unifying principles are more valuable than mimicry

Closing remarks

Introduction

Closing remarks

forced

Catalyze interdisciplinary collaboration

Closing remarks

Introduction

Closing remarks

forced

Inject new ideas

Closing remarks

Introduction

Closing remarks

forced

Inject new ideas

Provides new avenues for careers after graduate school!

Thanks!

Introduction

Closing remarks

(bringing engineers and animals together)

Thank you!

Helpful People: Kevin Passino, Tom Waite, Ian Hamilton

Funding Sources:

Questions?

engineering serendipity: successes in solitary foraging ... · the biomimicry institute david...

Documents

contents...a faulty pgm-fi system is often related to poorly...

the water harvesting innovations - muonde trust...the water...

microtubules soften due to cross- sectional flattening ·...

ict4s keynote - frances brazier

clostridium brazier agar...

2009 grill & chill /brazier

sixth electoral district list 2015... · 98 bradshaw,...

best practices for software development in the … practices...

spotting faulty logic

4 th general session 2 february 2016 1. agenda welcome –...

hs171 troubleshooting guide 2 - thermopatch · 4. touch...

faulty logic/ · pdf filewhat is faulty reasoning and faulty...

faulty rsa encryption

faulty repair shop - segaarcade.com · faulty repair shop...

photo: david brazier/iwmi photo :tom van...

iv. faulty reasoning

william ackerman william brazier · 2017. 10. 27. ·...

jennifer kokoska kb boomer (advisor) richard brazier...

case_a faulty budget

faulty record