Repeated Auction Games and Learning Dynamics in Electronic Logistics Marketplaces: Regulation through Information

Hani S. Mahmassani, University of Maryland

Potentials of Complexity Science for Business, Governments and the Media
Collegium Budapest, August 3-5, 2006


Once upon a time, there was a physical world…

Motivation

Developments in Information and Communication Technologies are:
- Transforming supply chain operations
- Enhancing transportation service levels and optimizing its performance
- Introducing new ways of meeting supply (capacity) and demand (shippers)
  - Freight matching
  - Transportation auctions
  - Supply chain integration tools


Presentation Outline

1. Motivation
2. Characteristics of Transportation Auctions
3. Problem Definition
4. Methodology
5. Results
6. Ongoing and Future Research

Collaborative research with
- Miguel Figliozzi (former PhD student at Maryland, now Asst. Professor at University of Sydney)
- Patrick Jaillet (MIT)

[Figure: Range of Shipper/Carrier Procurement Structures. Structures shown along a spectrum: Vertical Integration, Long Term Contracts, 3PL Services, Private Exchanges, Spot Market, Brokers/Public Exchange. Participants: Private Fleet, Core Carriers, Any Carrier/Shipper. Scale labels: "+ Control, Collaboration, Reliability", "+ Long Term Relationships", "+ Savings from Collaboration, Customized Services", "+ Savings from Better Routing, Economies of Scale/Scope", "Number of Participants +".]

Illustrative Benefits of Market-based Procurement: Wealth Generation

Comparing Transportation Market Environments

Vertical Integration
Assignment to fleets:
- To own fleet
- One shipment at a time
- In real time
- In order of arrival

Spot Market
Assignment to fleets:
- To best bidder
- One shipment at a time
- In real time
- In order of arrival (zero probability of bidding on two shipments)

Common Characteristics
- Stochastic arrival of shipments
- Hard time windows

Simulate market under
- Different arrival rates (low to high)
- Different time window widths (short to long)
- Truth-revealing second price auction

Performance: Total Wealth Generated, Shipments Served, System Empty Distance

Vertical Integration vs. Spot Market (4 shippers and 4 carriers)

Each chart groups results by time window width (TW: Short, Med., Long) within each arrival rate (AR: Low, Med., High).

[Figure: Total Wealth Generated Change %. Vertical axis: wealth generated change (% increase), 0% to 45%.]

[Figure: Shipment Served Change. Vertical axis: shipments served, 0 to 600.]

[Figure: Shipment Served Change %. Vertical axis: shipments served, 0% to 30%.]

[Figure: Avg. Empty Distance Change. Vertical axis: decrease, 0.00 to 0.20.]

[Figure: Avg. Empty Distance Change %. Vertical axis: % decrease, 0% to 50%.]

Dynamic Pricing in a Sequential Auction Marketplace...

- The market generates a sequence of auctions
- Prices are generated as:
  - The outcome of carrier bids
  - Predefined set of rules (auction rules)
- A carrier’s behavior is expressed through his/her bids
- The auction marketplace is a useful laboratory to gain insight into:
  - Carrier behavior
  - Learning and adaptation
  - Effectiveness of competitive strategies
  - Impact of information availability on system performance


What are the characteristics of transportation auctions?

- The traded entity is a service
- Transportation services are perishable, non-storable commodities
- Demand and supply are geographically dispersed but they exchange real-time information online
- Group Effect: the value of a traded item (shipment) may be strongly dependent upon the acquisition of other items (e.g. nearby shipments)
- Network Effect: the value of a shipment is related to the current spatial and temporal deployment of the fleet
- Uncertainty:
  - demand/supply over time and space
  - prices

Sources of Complexity

1. Multiple interacting agents with multiple conflicting objectives
2. Modeling agents: bounded rationality and learning
3. Demand: spatial and temporal stochasticity
4. Uncertainties about a shipment's value and cost
5. Fleet management complexities (routing, time windows, penalties, etc.)
6. New class of problems created by different market designs and levels of information


Problem Context

Sequential Auction Market Environment
- Stochastic arrival of non-identical shipments
- Sequential auction of arriving shipments
- Bidding is done one shipment at a time, in order of arrival
- Carriers’ objective is to maximize expected profits while managing the fleet to satisfy service quality constraints (time windows)
- The principal operating costs are proportional to the shipment haul-length and the empty distance required
- Carrier costs and capacity are affected by the history of bidding and shipment assignment decisions

Problem Categorization

Two Layers of Allocations:

Auction: Shippers → Bidders
- Pricing of resources
- Profit maximization problem
- Strategic problem

Fleet Management: Shipments → Trucks
- Allocation of own resources
- Cost minimization problem
- Non-strategic problem

The joint bidding/fleet management problem is highly complex

Game Formulation

A sequential auction as a dynamic game of imperfect information
- dynamic: carriers face each other at different stages
- imperfect information: carriers are uncertain about competitors’ private information (whatever affects competitors’ shipment cost)
- Stages are identified with shipment arrival epochs
- A carrier has full knowledge of his own fleet status (vehicles and shipments) and technology
- A carrier is uncertain about competitors’ fleet status, technology, or bidding function

Finding a Bidding Policy…

In auctions, profits are highly dependent on the quality of the bidding policy.

A bidding policy is a function that produces a bid value using information about:
- the state of the carrier,
- the characteristics of the shipment for auction,
- the marginal cost of serving the shipment,
- the auction type, and
- beliefs about the competitors and environment

Finding a Bidding Policy…

- Problem complexity generally precludes finding a policy that “optimizes” the entire auction/assignment problem.
- Each auction provides an opportunity for carriers to learn about
  - The environment
  - Other players’ strategies
- Learning potential depends on the information disclosed after each auction

Information Levels

The information revealed after each auction can influence the nature and rate of the process by which carriers learn about the “game” and their competitors’ behavior.

Information includes:
- Actions (bids) placed
- Number of players (carriers) participating
- Links (name) between carriers and bids
- Individual characteristics of carriers (e.g. fleet size)
- Payoffs received
- Knowledge about who knows what, information asymmetries, or shared knowledge about previous items

Define TWO information levels
- Maximum information environment: all of the above information is revealed.
- Minimum information environment: no information is revealed.

These two extremes approximate two realistic situations
- Maximum information would correspond to a real-time internet auction where all auction information is accessed by participants.
- Minimum information would correspond to a shipper telephoning carriers for a quote, and calling back only the selected carrier.

Learning in Different Environments

Minimum information setting
- Genetic Algorithms
- Reinforcement Learning

Maximum information setting
- Fictitious Play
- Machine Learning
- Rationalizable and Machine Learning
- Rule Learning
- Rules of Thumb Learning (e.g. Tit for Tat)

Carriers’ Decisions

- Strategic decisions: investment of resources, for the purpose of learning about or influencing competitors, to improve profits by manipulating future auction outcomes.
  - Identifying decisions are characterized by attempts to identify or discover a competitor’s behavior.
  - Signals are decisions that aim to establish a reputation or status for the carrier.
- Operating decisions: decisions that are not strategic but aim to improve a carrier’s profit level (e.g. rerouting of the fleet after a successful bid)

Bounded Rationality

- Carriers can analyze the history of play with different degrees of sophistication (bounded rationality) and estimate the possible future consequences of current actions.
- Research in the area of learning in games is actively seeking to explain how agents acquire, process, evaluate or search for information.

Cognitive and computational limitations can be evidenced in:
- Identification: the carrier has limited ability to discover competitors’ behavioral types, which may require complex econometric techniques
- Signaling: limited ability to “read” or “send” signals that convey a reputation
- Memory: limited ability to record and keep past outcome information, or limited memory to simulate all possible future paths in the decision tree
- Optimization: even if carriers could identify competitors’ behavior, their ability to formulate and solve stochastic optimization problems is likely limited


Case Study: Myopic Carrier Learning in Second Price Auctions

Study the impact of different learning techniques on:
- Carriers’
  - Profits
  - Market share
- Shippers’
  - Consumer surplus
  - Number of shipments served

Under different market settings
- Minimum information
- Maximum information
- Low arrival rate (uncongested)
- High arrival rate (congested)

Research Methodology

Define Auction Type: Second Price Auction

DEFINITION (reverse auction)
- Carrier with lowest bid wins the item
- Winner gets paid the second lowest bid
- Rest of bidders do not pay or receive anything

PROPERTIES (one-shot auction, Vickrey 1961)
- Equilibrium strategies are truth-revealing and dominant strategies: bid true marginal cost
- They do not require gathering or analysis of information about the competitors’ situation
- Leads to complete economic efficiency: the bidder with the lowest cost wins
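The reverse second price rule defined above can be sketched in a few lines. This is only an illustration, not code from the study: the function name `second_price_reverse_auction`, the tie-breaking by carrier name, and the single-bidder fallback (the winner is paid its own bid) are assumptions introduced here.

```python
from typing import Dict, Optional, Tuple


def second_price_reverse_auction(bids: Dict[str, float]) -> Optional[Tuple[str, float]]:
    """Return (winner, payment) for one shipment, or None if nobody bids.

    Rule from the slides: the lowest bid wins and the winner is paid the
    second lowest bid; all other bidders pay and receive nothing.
    Assumptions (not from the slides): ties are broken by carrier name,
    and a lone bidder is simply paid its own bid.
    """
    if not bids:
        return None
    # Sort by bid value first, then by name so ties resolve deterministically.
    ranked = sorted(bids.items(), key=lambda item: (item[1], item[0]))
    winner, lowest_bid = ranked[0]
    payment = ranked[1][1] if len(ranked) > 1 else lowest_bid
    return winner, payment
```

For example, `second_price_reverse_auction({"A": 1.0, "B": 1.3, "C": 1.2})` returns `("A", 1.2)`: carrier A wins with the lowest bid and is paid C's bid, the second lowest.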

Problems with Second Price Auctions

Rothkopf, Teisberg, and Kahn (1990, JPE)
- Auctioneer may cheat in the auction
- Vulnerability to bidder collusion
- Revelation of private information

Sandholm (2000, IJEC)
- Complexity of bidding: looking at the future sequence of arrivals introduces speculation about future bids of other bidders
- Untruthful bidding with risk-averse bidders

Marginal Cost Bidding (Krishna, 2002)
- Not necessarily optimal in sequential auctions
- No known equilibrium for sequential auctions where bidders have multi-unit demand curves

Finding an “Optimal” policy...

Limited to myopic bidding
- Will not consider impact of bidding on competitors’ future behavior
- Will not consider impact of bidding on future service costs

Limited to finding “the best” constant marginal cost factor c such that:

bid = c × marginal cost

Compare learning strategies:
- Reinforcement Learning
- Tit for Tat
- Marginal Cost Bidding (c = 1)
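To make the constant-factor policy concrete, the sketch below evaluates bid = c × marginal cost in a single one-shot second price reverse auction against a fixed rival bid. It is a toy check of the one-shot Vickrey property only; it does not reproduce the sequential simulation in the study. The function names, the rival bids, and the rule that a tie is lost are illustrative assumptions.

```python
from typing import Iterable


def one_shot_profit(c: float, marginal_cost: float, rival_bid: float) -> float:
    """Profit of a carrier bidding c * marginal_cost against one rival bid.

    Second price reverse auction: if our bid is strictly lower we win and are
    paid the rival's bid, so profit = rival_bid - marginal_cost (which can be
    negative when c < 1). Assumption: a tie counts as a loss.
    """
    bid = c * marginal_cost
    if bid < rival_bid:
        return rival_bid - marginal_cost
    return 0.0


def total_profit(c: float, marginal_cost: float, rival_bids: Iterable[float]) -> float:
    """Sum of one-shot profits over a set of hypothetical rival bids."""
    return sum(one_shot_profit(c, marginal_cost, rb) for rb in rival_bids)


# Hypothetical inputs: factors 0.5 .. 1.5 in steps of 0.1, as on the chart axes.
factors = [0.5 + 0.1 * k for k in range(11)]
rival_bids = [0.8, 1.0, 1.2]
profits = {c: total_profit(c, 1.0, rival_bids) for c in factors}
best = max(profits.values())
```

In this toy setting `total_profit(1.0, 1.0, rival_bids)` attains `best`: underbidding (c < 1) can win shipments at a loss, and overbidding (c > 1) can forgo profitable ones, while neither changes the price received.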

Finding an “Optimal” policy...

Reinforcement Learning
- An agent chooses an action with a probability that is directly proportional to the profit that such action has achieved in the past
- Initially the agent starts with positive uniform profits over all possible marginal cost factors
- As each action (bid) is played, the agent updates the profit level with the payoff obtained
- Over time, profit levels will converge (if facing a stationary environment)
- This learning method can be utilized under minimum or maximum information settings


Reinforcement Learning (ctd.)

- $A = \{a_1, \ldots, a_n\}$: set of available actions (bids)
- $r(a_i)$: average reward obtained using action $a_i$ in the past (includes both won and lost bids)
- $P(a_i)$: probability of playing action $a_i$

$$P(a_i) = \frac{r(a_i)}{\sum_{a_j \in A} r(a_j)}, \qquad a_i \in A$$
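A minimal sketch of this proportional selection rule follows, where the actions are marginal cost factors. The class name `ProportionalLearner`, the initial reward of 1.0, and the small positive floor on weights are assumptions made here for illustration. The floor is needed only because average rewards can turn negative when a factor below 1 wins shipments at a loss, a case the slides do not spell out.

```python
import random
from typing import List, Optional, Sequence


class ProportionalLearner:
    """Chooses an action with probability proportional to its average reward."""

    def __init__(self, actions: Sequence[float], initial_reward: float = 1.0,
                 seed: Optional[int] = None) -> None:
        self.actions: List[float] = list(actions)
        # Start with positive uniform rewards over all actions (as in the slides).
        self.total = {a: initial_reward for a in self.actions}
        self.count = {a: 1 for a in self.actions}
        self.rng = random.Random(seed)

    def average_reward(self, action: float) -> float:
        return self.total[action] / self.count[action]

    def probabilities(self) -> List[float]:
        # P(a_i) = r(a_i) / sum_j r(a_j); weights are floored at a tiny positive
        # value (assumption) so the distribution stays valid after losses.
        weights = [max(self.average_reward(a), 1e-9) for a in self.actions]
        norm = sum(weights)
        return [w / norm for w in weights]

    def choose(self) -> float:
        return self.rng.choices(self.actions, weights=self.probabilities(), k=1)[0]

    def update(self, action: float, payoff: float) -> None:
        # Record the payoff of a played bid, whether it was won or lost.
        self.total[action] += payoff
        self.count[action] += 1
```

Because `update` needs only the carrier's own payoff, the same learner works under both the minimum and the maximum information setting.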


Tit for Tat

- Tit for Tat is more an “adaptive” rule of thumb than a learning mechanism
- It is a robust strategy in many strategic situations
- This carrier roughly imitates what his opponent is doing
- With two carriers, A and T, where carrier T plays Tit for Tat, this rule of thumb can be defined as:
  - Carrier T computes the average bid value of carrier A over the last T auctions, called â
  - Carrier T computes his own marginal cost average over the last T auctions, called ĉ
  - Carrier T obtains: α = â / ĉ
  - T’s next bid will be equal to: bid = α × marginal cost
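The rule of thumb above translates almost directly into code. The sketch below is illustrative only: the class name `TitForTatBidder`, the `window` parameter (standing in for the number of past auctions averaged over), and the fallback α = 1 before any history exists are assumptions. Observing the rival's bids presupposes the maximum information setting.

```python
from collections import deque
from typing import Deque


class TitForTatBidder:
    """Imitates a rival by scaling own marginal cost with alpha = a_hat / c_hat."""

    def __init__(self, window: int) -> None:
        # Only the most recent `window` auctions are kept.
        self.rival_bids: Deque[float] = deque(maxlen=window)
        self.own_costs: Deque[float] = deque(maxlen=window)

    def observe(self, rival_bid: float, own_marginal_cost: float) -> None:
        """Record the rival's bid and our own marginal cost for one auction."""
        self.rival_bids.append(rival_bid)
        self.own_costs.append(own_marginal_cost)

    def alpha(self) -> float:
        # Assumption: with no history (or zero average cost) bid marginal cost.
        if not self.rival_bids or sum(self.own_costs) == 0:
            return 1.0
        a_hat = sum(self.rival_bids) / len(self.rival_bids)  # average rival bid
        c_hat = sum(self.own_costs) / len(self.own_costs)    # average own cost
        return a_hat / c_hat

    def bid(self, marginal_cost: float) -> float:
        return self.alpha() * marginal_cost
```

If the rival bids close to its own marginal cost and both carriers face similar costs, α stays near 1; if the rival inflates its bids, α rises with it, which is the leader/follower behavior discussed in the results.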

Behavioral Assumptions and Rules

Carriers
- Non-cooperative carriers
- Preference over game outcomes with highest expected return, risk neutral
- Myopic strategies

Shippers
- Shipper selects carrier with lowest bid
- Shipper does not cheat

Other Market Settings

- 2 carriers
- Geographic area: 1 × 1 square space
- Shipment origin and destination uniformly distributed over space
- Earliest pick-up time = arrival time
- Latest pick-up time = arrival time + time window length (2 units of time + uniform[0,2])
- Fleet size: 12 vehicles (constant) serving the market
- The reservation price of the buyer (shipper) is distributed uniform[1.4,1.5]
- λ = 0.25 arrivals/unit time/truck (not congested)
- λ = 1.00 arrivals/unit time/truck (congested)
- Results obtained with 10 iterations of 10,000 arrivals each
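The settings listed above are enough to sketch a shipment generator. This is a hedged illustration, not the simulator used in the study: the slides say only that arrivals are stochastic, so the exponential inter-arrival times (a Poisson process) are an assumption, as are the Euclidean haul length, the reading of 12 vehicles as the market-wide fleet, and all names (`Shipment`, `generate_shipments`).

```python
import math
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Shipment:
    arrival_time: float
    origin: Point
    destination: Point
    earliest_pickup: float
    latest_pickup: float
    reservation_price: float

    @property
    def haul_length(self) -> float:
        # Assumption: straight-line (Euclidean) distance in the unit square.
        return math.dist(self.origin, self.destination)


def generate_shipments(n: int, rate_per_truck: float, fleet_size: int = 12,
                       seed: Optional[int] = None) -> List[Shipment]:
    """Generate n shipments using the market settings from the slides."""
    rng = random.Random(seed)
    market_rate = rate_per_truck * fleet_size  # arrivals per unit time
    clock = 0.0
    shipments: List[Shipment] = []
    for _ in range(n):
        clock += rng.expovariate(market_rate)   # assumed Poisson arrivals
        window = 2.0 + rng.uniform(0.0, 2.0)    # time window length
        shipments.append(Shipment(
            arrival_time=clock,
            origin=(rng.random(), rng.random()),
            destination=(rng.random(), rng.random()),
            earliest_pickup=clock,
            latest_pickup=clock + window,
            reservation_price=rng.uniform(1.4, 1.5),
        ))
    return shipments
```

Under the market-wide reading of the fleet size, λ = 0.25 and λ = 1.00 per truck correspond to 3 and 12 arrivals per unit time, which matches the arrival rates 3 and 12 on the result charts.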


Optimality of Bidding Marginal Cost with Low Arrival Rates (AR=3)

- Carrier “MC” always bids marginal cost
- Carrier “D” bids marginal cost multiplied by c
- Highest profits when c = 1

[Figure: Profits after Deviation from Marginal Cost. Horizontal axis: marginal cost factor c, 0.50 to 1.50. Vertical axis: profits, 400 to 850. Series: Deviating Carrier.]

Reinforcement Learning

- Carriers discover that c = 1 provides the highest profits among all possible factors

[Figure: Reinforcement Learning. Horizontal axis: marginal cost factor, 0.50 to 1.50. Vertical axis: probability, 3.0% to 5.5%. Series: Deviating Carrier.]

Learning is not Free

Comparing shipper surplus and rejected shipments between carriers playing Reinforcement Learning (RL) and carriers bidding Marginal Cost (MC)

[Figure: RL vs MC Carriers. Horizontal axis: arrival rate (3, 12). Vertical axis: shipments rejected, 0 to 600. Series: RL, MC.]

[Figure: RL vs MC Carriers. Horizontal axis: arrival rate (3, 12). Vertical axis: consumer surplus, 0 to 12000. Series: RL, MC.]

Learning is not competitive...

Comparing shipments won and profits when an RL carrier competes against an MC (c = 1) carrier

[Figure: Reinf. Learn. vs. MC Carrier. Horizontal axis: arrival rate (3, 12). Vertical axis: shipments won, 0 to 8000. Series: RL, MC.]

[Figure: Reinf. Learn. vs. MC Carrier. Horizontal axis: arrival rate (3, 12). Vertical axis: total profits, 0 to 3500. Series: RL, MC.]

Tit for Tat is a Robust Strategy

- Maximum information setting
- Tit for Tat competing with an RL carrier and with an MC carrier (profits)
- Tit for Tat successfully “imitates” its competitor

[Figure: Tit for Tat vs Reinf. Learn. Horizontal axis: arrival rate (3, 12). Vertical axis: total profits, 0 to 3000. Series: TT, RL.]

[Figure: Tit for Tat vs. MC Carrier. Horizontal axis: arrival rate (3, 12). Vertical axis: total profits, 0 to 3000. Series: TT, MC.]

Too much information could be a problem...

- If the “leader” becomes aware of his leadership, it is a dominant strategy to raise prices
- Graphs compare profits for the MC carrier and the Tit for Tat carrier when the leader goes from c = 1 to c = 2

[Figure: Leading Carrier. Horizontal axis: arrival rate (3, 12). Vertical axis: total profits, 0 to 3000. Series: c = 1, c = 2.]

[Figure: Tit for Tat Carrier. Horizontal axis: arrival rate (3, 12). Vertical axis: total profits, 0 to 3000. Series: c = 1, c = 2.]

Conclusions

- Even with minimum information, learning is possible (e.g. convergence towards marginal cost bidding)
- Learning is expensive for both carriers and shippers
  - Learning carriers fare badly against competitors that have already found “optimal” policies
  - Shippers do pay for learning when all carriers are learning
    - Higher prices
    - Fewer shipments served
- Therefore, the market setting should be such that learning duration is minimized

Conclusions

- Maximum information settings allow a large array of possible new behaviors
  - Tit for Tat or “imitation” is possible: a typical market with a “leader” and a “follower”
  - From a carrier's point of view, Tit for Tat is robust
  - From the shippers' point of view, Tit for Tat is “good” as long as the leader follows a competitive bidding policy
- Problem with too much information:
  - If the “leader” becomes aware of his leadership, it is a dominant strategy to raise prices
  - The follower gladly follows suit since his profits also rise

Incentive Compatibility Issues

A truth-revealing second-price auction market has wealth-creation potential
Can a truth-revealing market be sustained?
Experiment: compare the performance of a truth-revealing carrier and a non-truth-revealing carrier
  One carrier uses a bidding factor ≠ 1
  The other carrier bids his marginal cost
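The experiment described above can be sketched in a few lines. This is a minimal illustration, not the simulation behind the slides: it assumes two carriers, a reverse (lowest-bid-wins) second-price auction, and marginal costs drawn independently and uniformly on [0, 1] for each shipment. The function names `second_price_reverse_auction` and `simulate` are hypothetical.

```python
import random


def second_price_reverse_auction(bids):
    """Lowest bid wins and is paid the second-lowest bid (ties go to the lower index)."""
    order = sorted(range(len(bids)), key=lambda i: bids[i])
    winner, runner_up = order[0], order[1]
    return winner, bids[runner_up]


def simulate(bid_factor, n_auctions=10_000, seed=0):
    """Return total profits as (factor_carrier, marginal_cost_carrier).

    Assumption: costs are i.i.d. uniform on [0, 1] per shipment; this cost
    model is illustrative only and ignores fleet status and routing.
    """
    rng = random.Random(seed)
    profits = [0.0, 0.0]
    for _ in range(n_auctions):
        costs = [rng.random(), rng.random()]
        # Carrier 0 scales its cost by the bidding factor; carrier 1 bids its cost.
        bids = [bid_factor * costs[0], costs[1]]
        winner, payment = second_price_reverse_auction(bids)
        profits[winner] += payment - costs[winner]
    return tuple(profits)


if __name__ == "__main__":
    for factor in (0.5, 1.0, 1.5):
        print(factor, simulate(factor))
```

Under these assumptions, and with the same seed (hence the same cost draws), the carrier using a factor different from 1 cannot earn more than it would by bidding its marginal cost. This is the one-shot incentive-compatibility property of the second-price auction; the sketch does not capture the repeated-game effects the slides are concerned with.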

Incentive Compatibility

Ongoing and Future Research…

Develop more sophisticated strategies that take into account the future consequences of the current bid on:
  Player's own future costs
  Player's own future revenues
Develop strategies that use marketplace information to identify competitors' behavior and manipulate future outcomes
Study how market performance is affected by varying:
  number of carriers
  auction mechanisms
  information disclosed
Develop discrete choice models of competitor behavior under different observation and information conditions

QUESTIONS?

The Truck-Load Procurement Market (TLPM) formulation differs from other auction formulations in several respects:

(a) items auctioned (shipments) are multi-attributed
(b) costs are functions of carriers' status and vehicle routing technologies
(c) history and fleet management decisions affect future cost probability distributions
(d) capacity constraints are linked to private information and shipment characteristics
(e) bidding strategies are dependent on public and private history
(f) timing of auctions is important
(g) it is an online sequential auction

Game Formulation

Notation

$n$ carriers competing, each carrier $i \in \{1, 2, \ldots, n\}$ (the set of carriers)
the set of arriving shipments is $\{s_1, s_2, \ldots, s_N\}$
the set of auction announcement epochs is $\{t_1, t_2, \ldots, t_N\}$
$t_j$ represents the time when shipment $s_j$ arrives
$S_{j+1,\ldots,N} = \{s_{j+1}, \ldots, s_N\}$ (set of shipments arriving after $s_j$)
each carrier simultaneously bids a monetary amount $b_j^i \in \mathbb{R}$ for $s_j$
$y_j$: public information generated after the auction for $s_j$
$h_j = (h_0, y_1, y_2, \ldots, y_{j-1})$: information publicly known before bidding for shipment $s_j$

Notation

$\theta_j^i = \{z_j^i, a^i, c^i\}$: private information for each carrier at time $t_j$
$z_j^i$: carrier status before bidding (shipments + fleet)
$z_j^i = a^i(t_j, h_j, z_{j-1}^i)$: assignment function
$c_j^i = c^i(s_j, z_j^i)$: cost function
$p(\theta_j^{-i} \mid h_j, \theta_j^i)$: conditional probability about opponents' type
$b_j^i = b^i(s_j, h_j, \theta_j^i)$: bidding function
$b^{-i} = \{b^1, \ldots, b^{i-1}, b^{i+1}, \ldots, b^n\}$: competitors' bidding functions

Notation

$q_j = q(b_j) = \{q_j^1, \ldots, q_j^n\} \in [0,1]^n$: auction assignment function
$\sum_i q_j^i = 1$: probabilities of winning shipment $s_j$
$m_j = m(b_j, q_j)$: auction expected payment function
$\pi_j^i = m_j^i - c^i[s_j, \theta_j^i]\, q_j^i$: expected profit for shipment $s_j$, written in full as $\pi_j^i(m^i, q^i, b^i, b^{-i}, s_j, h_j, \theta_j^i, \theta_j^{-i}, c^i[s_j, \theta_j^i])$
$(\Omega, F, P)$: probability space of arrivals and shipment characteristics

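Online Equilibrium

Bidding Functions Equilibrium

$$b^{*i} \in \arg\max_{b \in B} \; E_{(\theta_j^{-i},\, \omega_{j,..,N})}\Big[\pi^i_{j,..,N}\big(m, q, b, b^{*-i}, s_j, h_j, \theta_j^i, \theta_j^{-i}, c^i[s_j, \theta_j^i]\big)\Big] \quad \forall\, i, s_j, h_j$$

with $\theta_j^{-i}$ distributed according to $p(\theta_j^{-i} \mid h_j, \theta_j^i)$ and $\omega_j = \{t_j, s_j\}$.

$$\pi^i_{j,..,N}\big(m, q, b^i, b^{-i}, s_j, h_j, \theta_j^i, \theta_j^{-i}, c^i[s_j, \theta_j^i]\big) = \pi^i_j\big(m, q, b^i, b^{-i}, s_j, h_j, \theta_j^i, \theta_j^{-i}, c^i[s_j, \theta_j^i]\big) + E_{(\omega_{j+1},\ldots,\omega_N)}\Big[\sum_{k=j+1}^{N} \pi^i_k\big(m, q, b^i, b^{-i}, s_k, h_k, \theta_k^i, \theta_k^{-i}, c^i[s_k, \theta_k^i]\big)\Big]$$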

Relaxing Rationality

In a bounded rational model, a carrier faces two basic types of uncertainties regarding the competition:
  an uncertainty relative to the competitors' private information: $p(\theta_j^{-i} \mid \theta_j^i)$
  an uncertainty relative to the competitors' bounded rationality type or bidding function: $b^{-i}(s_j, h_j, \theta_j^{-i})$
These uncertainties can be combined into a "price" function: $\xi \sim f(\xi, \theta_j^{-i}, b^{-i}, h_j)$
Stationary case: $\xi \sim f(\xi, h_j)$

"Price" Problem Formulation: 1st Price Auction

$$b_j^{*i} \in \arg\max_{b \in \mathbb{R}} \; E_{(\xi)}\Big[\big(b - c^i(s_j, z_j^i)\big)\, I_j^i + \pi^i_{j+1,..,N}(s_j \mid I_j^i = 1)\, I_j^i + \pi^i_{j+1,..,N}(s_j \mid I_j^i = 0)\,(1 - I_j^i)\Big]$$

$$\pi^i_{j+1,..,N}(s_j \mid I_j^i = 1) = E_{(\omega_{j+1},\ldots,\omega_N)}\Big[\sum_{k=j+1}^{N} E_{(\xi)}\big[\pi^i(c^i, s_k, z_k^i, \xi \mid I_j^i = 1)\big]\Big]$$

$$\pi^i_{j+1,..,N}(s_j \mid I_j^i = 0) = E_{(\omega_{j+1},\ldots,\omega_N)}\Big[\sum_{k=j+1}^{N} E_{(\xi)}\big[\pi^i(c^i, s_k, z_k^i, \xi \mid I_j^i = 0)\big]\Big]$$

$$E_{(\xi)}\big[\pi^i(c^i, s_k, z_k^i, \xi)\big] = E_{(\xi)}\big[\big(b_k^{*i} - c^i(s_k, z_k^i)\big)\, I_k^i \mid b_k^{*i}\big]$$

$I_k^i = 1$ if $\xi > b_k^i$ and $I_k^i = 0$ if $\xi < b_k^i$

$z_k^i = a^i(t_k, h_j, z_{k-1}^i)$
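A one-shipment (myopic) version of the first-price problem can be sketched as follows. It is a simplification, not the formulation above: the two continuation-profit terms are dropped, the distribution of the competitors' best price is replaced by an empirical sample of past prices, and the search is over a finite list of candidate bids. The names `win_probability` and `myopic_first_price_bid` are hypothetical.

```python
def win_probability(bid, observed_prices):
    """Empirical P(price > bid): share of past competitor prices this bid undercuts."""
    return sum(1 for price in observed_prices if price > bid) / len(observed_prices)


def myopic_first_price_bid(cost, observed_prices, candidate_bids):
    """Pick the candidate bid maximizing (bid - cost) * P(price > bid).

    Assumption: future consequences of winning or losing are ignored, so only
    the current shipment's expected profit is considered.
    """
    def expected_profit(bid):
        return (bid - cost) * win_probability(bid, observed_prices)

    return max(candidate_bids, key=expected_profit)


if __name__ == "__main__":
    # Three past competitor prices, service cost 5, three candidate bids.
    print(myopic_first_price_bid(5, [10, 20, 30], [9, 19, 29]))
```

In a first-price auction the winner is paid its own bid, so the bid trades off margin against the probability of winning; the sketch makes that trade-off explicit.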

"Price" Problem Formulation: 2nd Price Auction

$$b_j^{*i} \in \arg\max_{b \in \mathbb{R}} \; E_{(\xi)}\Big[\big(\xi - c^i(s_j, z_j^i)\big)\, I_j^i + \pi^i_{j+1,..,N}(s_j \mid I_j^i = 1)\, I_j^i + \pi^i_{j+1,..,N}(s_j \mid I_j^i = 0)\,(1 - I_j^i)\Big]$$

$$\pi^i_{j+1,..,N}(s_j \mid I_j^i = 1) = E_{(\omega_{j+1},\ldots,\omega_N)}\Big[\sum_{k=j+1}^{N} E_{(\xi)}\big[\pi^i(c^i, s_k, z_k^i, \xi \mid I_j^i = 1)\big]\Big]$$

$$\pi^i_{j+1,..,N}(s_j \mid I_j^i = 0) = E_{(\omega_{j+1},\ldots,\omega_N)}\Big[\sum_{k=j+1}^{N} E_{(\xi)}\big[\pi^i(c^i, s_k, z_k^i, \xi \mid I_j^i = 0)\big]\Big]$$

$$E_{(\xi)}\big[\pi^i(c^i, s_k, z_k^i, \xi)\big] = E_{(\xi)}\big[\big(\xi - c^i(s_k, z_k^i)\big)\, I_k^i \mid b_k^{*i}\big]$$

$I_k^i = 1$ if $\xi > b_k^i$ and $I_k^i = 0$ if $\xi < b_k^i$

$z_k^i = a^i(t_k, h_j, z_{k-1}^i)$
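The same myopic simplification can be applied to the second-price case, where the winner is paid the competitors' best price and not its own bid. This sketch again drops the continuation-profit terms and uses an empirical sample of past competitor prices; the function names are hypothetical.

```python
def myopic_second_price_profit(bid, cost, observed_prices):
    """Empirical mean of (price - cost) over the past prices this bid would undercut."""
    gains = [(price - cost) if price > bid else 0.0 for price in observed_prices]
    return sum(gains) / len(gains)


def best_myopic_second_price_bid(cost, observed_prices, candidate_bids):
    """Pick the candidate bid with the highest empirical one-shipment profit.

    Assumption: future consequences of winning or losing are ignored.
    """
    return max(
        candidate_bids,
        key=lambda bid: myopic_second_price_profit(bid, cost, observed_prices),
    )


if __name__ == "__main__":
    # Candidates below cost, at cost, and above cost.
    print(best_myopic_second_price_bid(15, [10, 20, 30], [5, 15, 25]))
```

With a cost of 15, bidding 5 wins an unprofitable shipment (price 10) and bidding 25 forgoes a profitable one (price 20), so the bid equal to cost does best among the three. This is the truth-revealing property in the one-shipment case only; once the continuation terms are restored, bidding the static cost is not guaranteed to be optimal.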