DNA Starts to Learn Poker David Harlan Wood 4 * Hong Bi 1 Steven O. Kimbrough 2 Dongjun Wu 3 Junghuei Chen 1* Departments of 1 Chemistry & Biochemistry and 4 Computer & Information Sciences University of Delaware 2 The Wharton School, University of Pennsylvania 3 Bennett S. LeBow College of Business, Drexel University

Post on 06-Jan-2016

Page 1

DNA Starts to Learn Poker

David Harlan Wood (4)*, Hong Bi (1), Steven O. Kimbrough (2), Dongjun Wu (3), Junghuei Chen (1)*

Departments of (1) Chemistry & Biochemistry and (4) Computer & Information Sciences, University of Delaware

(2) The Wharton School, University of Pennsylvania

(3) Bennett S. LeBow College of Business, Drexel University

 

Page 2

Player Dealt an Ace

Deal: Player dealt an Ace
    Player: Say Ace (adds $1)
        Dealer: Call (adds $1) -> Loses $2
        Dealer: Fold -> Loses $1

Page 3

Player Dealt a 2

Deal: Player dealt a 2
    Player: Say 2 -> Loses $1
    Player: Say Ace (adds $1)
        Dealer: Call (adds $1) -> Wins $2
        Dealer: Fold -> Loses $1

Page 4

Deal: Player dealt an Ace
    Player: Say Ace (adds $1)
        Dealer: Call (adds $1) -> Loses $2
        Dealer: Fold -> Loses $1
Deal: Player dealt a 2
    Player: Say 2 -> Loses $1
    Player: Say Ace (adds $1)
        Dealer: Call (adds $1) -> Wins $2
        Dealer: Fold -> Loses $1

OBJECTIVE: To Obtain Probabilistic Strategies

Each player wants to obtain a strategy for the game.

A strategy prescribes an action in every possible situation.

That is, at each node, a strategy specifies whether to raise as a function of the hand dealt.
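One way to picture a probabilistic strategy for this game is as a pair of probabilities, one per decision node. The sketch below is a software analogue only, not the DNA procedure. It assumes the payoffs on the tree are read from the dealer's side (an interpretation of the slide labels) and that a player holding an Ace always says Ace. The names `play_hand`, `p_bluff`, and `p_call` are hypothetical.

```python
import random


def play_hand(card, p_bluff, p_call, rng):
    """Play one hand and return the dealer's payoff in dollars.

    card    -- "A" or "2", the card dealt to the player
    p_bluff -- probability that the player says Ace when holding a 2
    p_call  -- probability that the dealer calls after hearing "Ace"
    Assumption: with an Ace the player always says Ace.
    """
    says_ace = card == "A" or rng.random() < p_bluff
    if not says_ace:
        return 1   # player says 2 and loses $1
    if rng.random() >= p_call:
        return -1  # dealer folds and loses $1
    # Dealer calls: a $2 showdown decided by the card.
    return -2 if card == "A" else 2


rng = random.Random(0)
hands = [play_hand(rng.choice("A2"), 1 / 3, 2 / 3, rng) for _ in range(10_000)]
print(sum(hands) / len(hands))  # average dealer payoff per hand
```

Setting `p_bluff` or `p_call` to 0 or 1 recovers a pure strategy, so the same function covers both the deterministic and the probabilistic case.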

Page 5

Poker

Play New Game

New Dealer Strategies

Deals

Assemble

New Player Strategies

Page 6

Learning

Separate by Payoffs

Programmable Selection of Recovered Dealer Strategies

Programmable Selection of Recovered Player Strategies

Dealer’s Adaptation: Amplify, Crossover, Mutate

Player’s Adaptation: Amplify, Crossover, Mutate

Recover & Distribute Strategies

Recover & Cut Play Histories for Player’s & Dealer’s Strategies

Player’s Strategies

Dealer’s Strategies

Page 7

Learning Poker

Play New Game

Separate by Payoffs

Programmable Selection of Recovered Dealer Strategies

Programmable Selection of Recovered Player Strategies

Dealer’s Adaptation: Amplify, Crossover, Mutate

New Dealer Strategies

Player’s Adaptation: Amplify, Crossover, Mutate

Deals

Assemble

New Player Strategies

Recover & Distribute Strategies

Recover & Cut Play Histories for Player’s & Dealer’s Strategies

Player’s Strategies

Dealer’s Strategies

Page 8

Dealer’s Strategies: R.E. 1, R.E. 2; Say A’ FOLD’ Call’ Fold’; two Stoppers

Player’s Strategies: R. E. 1; 2’ Say 2’ Fold’ Error SAY 2’ Say A’ A’ Say A’; three Stoppers

Dealt 2: 2; R.E. 2

Dealt A: A; R.E. 2

[Game tree repeated from Page 4]

Sequences from: Sakamoto et al., DNA4 (1997)

Page 9

Dealer’s Strategies

Page 10

Player’s Strategies

Page 11

Deals

Page 12

Two Strategies and a Deal Define a Game

A Player’s Strategy: R. E. 1; 2’ Say 2’ Fold’ Error SAY 2’ Say A’ A’ Say A’

A Dealer’s Strategy: R.E. 1, R.E. 2; Say A’ FOLD’ Call’ Fold’

Ace Dealt: A; R.E. 2

Page 13

Cut with R.E.1 & R.E.2 and Assemble a Game

Assembled game (Player’s Strategy | Dealer’s Strategy | Deal):
2’ Say 2’ Fold’ Error SAY 2’ Say A’ A’ Say A’ | Say A’ FOLD’ Call’ Fold’ | A

Pieces:
Player’s Strategy (R. E. 1): 2’ Say 2’ Fold’ Error SAY 2’ Say A’ A’ Say A’
Dealer’s Strategy (R.E. 2): Say A’ FOLD’ Call’ Fold’
Deal: A

Page 15

Assembled game (Player’s Strategy | Dealer’s Strategy | Deal):
2’ Say 2’ Fold’ Error SAY 2’ Say A’ A’ Say A’ | Say A’ FOLD’ Call’ Fold’ | A

Strands: S1 (74-mer), S2 (57-mer), S3 (48-mer), S4 (53-mer)

L1 (25-mer), L2 (28-mer), L3 (28-mer)

[Gel image. Lanes: S1, S2, S3, S4, R1, R2, M. R1: Ligation Reaction. R2: Purified Ligation Product. Size labels: 50, 75, 100, 150, 200, 225, 232.]

Page 16

Deal: Player dealt an Ace
    Player: Say Ace (adds $1)   [Player Says A]
        Dealer: Fold -> Loses $1   [Dealer Folds]
        Dealer: Call (adds $1) -> Loses $2   [Dealer MIGHT Change to Call]

Page 17

Player Dealt an Ace

Assembled game (Player’s Strategy | Dealer’s Strategy | Deal):
2’ Say 2’ Fold’ Error SAY 2’ Say A’ A’ Say A’ | Say A’ FOLD’ Call’ Fold’ | A

Player Says Ace (Player’s Strategy): A’ Say A’ with A, Extend(Say A)

Dealer Folds (Dealer’s Strategy): Say A’ Fold’ with Say A, Extend(Fold)

Dealer MIGHT Change to Call (Dealer’s Strategy): FOLD’ Call’ with Fold, Extend(Call); Fold’ Error Preventer

Page 18

Player Says Ace: A’ Say A’ with A, Extend(Say A)

Dealer Folds: Say A’ Fold’ with Say A, Extend(Fold)

Dealer MIGHT Change to Call: FOLD’ Call’ with Fold, Extend(Call); Fold’ Error Preventer

[Gel image. Size markers: 200, 225, 250, 275, 300. Band labels: (232-mer), (247-mer), (262-mer), (282-mer).]

Page 19

Deal: Player dealt a 2
    Player: Say 2 -> Loses $1   [Player Says 2; (Block Say 2)]
    Player: Say Ace (adds $1)   [Player Changes to Say A]
        Dealer: Fold -> Loses $1   [Dealer Folds]
        Dealer: Call (adds $1) -> Wins $2   [Dealer Changes to Call]

Page 20

Player Dealt a 2

Deal: 2
Player’s Strategy: 2’ Say 2’ Fold’ Error SAY 2’ Say A’ A’ Say A’
Dealer’s Strategy: Say A’ FOLD’ Call’ Fold’

Player Says 2 (Player’s Strategy): Say 2’ 2’ with 2, Extend(Say 2)

Player MIGHT Change to Say Ace (Player’s Strategy): SAY 2’ Say A’ with Say 2, Extend(Say A)

Dealer Folds (Dealer’s Strategy): Say A’ Fold’ with Say A, Extend(Fold)

Dealer MIGHT Change to Call (Dealer’s Strategy): FOLD’ Call’ with Fold, Extend(Call); Fold’ Error Preventer

Page 21

Deal: Player dealt an Ace
    Player: Say Ace (adds $1)   [Player Says A]
        Dealer: Fold -> Loses $1   [Dealer Folds]
        Dealer: Call (adds $1) -> Loses $2   [Dealer MIGHT Change to Call]
Deal: Player dealt a 2
    Player: Say 2 -> Loses $1   [Player Says 2]
    Player: Say Ace (adds $1)   [Player MIGHT Change to Say Ace]
        Dealer: Fold -> Loses $1   [Dealer Folds]
        Dealer: Call (adds $1) -> Wins $2   [Dealer MIGHT Change to Call]

Page 23

Learning

Diagram labels: Separate by Payoffs; Programmable Selection of Recovered Dealer Strategies; Dealer’s Adaptation (Amplify, Crossover, Mutate); Recover & Distribute Strategies; Recover & Cut Play Histories for Player’s & Dealer’s Strategies; Player’s Strategies; Dealer’s Strategies.

Strategies are returned grouped by outcomes: -$2, -$1, +$1, +$2.

Select the Dealer’s own preferred mix of strategies to be bred.

Breed by using PCR to restore population size, using a variable mutation rate.

Cross over by pairwise recombining of “change your mind” regions.
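The selection and breeding steps listed on this slide resemble one generation of an evolutionary algorithm. The sketch below is an in-silico analogue under loud assumptions, not the laboratory protocol: each dealer strategy is reduced to a single bit (call or fold), payoffs are read from the dealer's side of the game tree, and the rule "breed from the +$1 and +$2 groups" is a hypothetical choice of the dealer's preferred mix. Crossover is left out because a one-bit strategy has nothing to recombine. All function and parameter names are illustrative.

```python
import random


def dealer_payoff(card, player_bluffs, dealer_calls):
    """Dealer's payoff in dollars (assumed reading of the game tree)."""
    if card == "2" and not player_bluffs:
        return 1   # player says 2
    if not dealer_calls:
        return -1  # dealer folds after hearing "Ace"
    return -2 if card == "A" else 2


def next_dealer_generation(dealers, p_bluff, rng, keep=(1, 2), mutation_rate=0.05):
    """Play, separate by payoff, select, and amplify one dealer population.

    dealers -- list of bools, True meaning "call", False meaning "fold"
    keep    -- payoff groups the dealer chooses to breed from (hypothetical rule)
    """
    groups = {-2: [], -1: [], 1: [], 2: []}
    for calls in dealers:
        card = rng.choice("A2")
        bluffs = rng.random() < p_bluff
        groups[dealer_payoff(card, bluffs, calls)].append(calls)
    # Selection: pool the preferred payoff groups, falling back to everyone.
    parents = [s for payoff in keep for s in groups[payoff]] or list(dealers)
    children = []
    while len(children) < len(dealers):  # amplify back to the original size
        child = rng.choice(parents)
        if rng.random() < mutation_rate:
            child = not child            # mutate: flip call/fold
        children.append(child)
    return children


rng = random.Random(1)
population = [rng.random() < 0.5 for _ in range(200)]
for _ in range(20):
    population = next_dealer_generation(population, p_bluff=0.9, rng=rng)
print(sum(population) / len(population))  # fraction of dealers that call
```

The fraction of callers in the population plays the role of the dealer's probabilistic strategy, which is why selection acts on a population rather than on a single strategy.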

Page 25

Complexity

Our complexity is linear in the number of nodes in the tree.

# nodes in tree = 2 players + betting rounds

At each node, we need a probability distribution giving “level of bet” as a function of “dealt hand”.

For us, the probability distribution is replaced by probabilistic hybridization of the DNA-encoded “dealt hand” to an adapting “change your mind about folding” region of the strategy.

The output (if generated) is an adapting “level of bet” region of the strategy.

[Diagram labels: hand, bet, next; next’; bet generator; next; Extend; bet’ hand’; hand evaluator]

Page 26

Comparison

Koller and Pfeffer derive equilibrium mixed strategies with complexity polynomial in

# nodes * # possible deals * 2 betting levels

“Representations and Solutions for Game-Theoretic Problems,” Artificial Intelligence (1997)

• Two-player games only
• Don’t exploit weakness of opponent
• No dynamics, only equilibrium
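For a sense of what an equilibrium mixed strategy looks like in the two-card game from the earlier slides, the check below verifies indifference conditions directly. It is not Koller and Pfeffer's algorithm. It assumes the payoffs are read from the dealer's side of the game tree, that Ace and 2 are dealt with equal probability, and that a player holding an Ace always says Ace. The values 1/3 and 2/3 follow from those assumptions and are not stated on the slides.

```python
from fractions import Fraction


def dealer_value(p_bluff, p_call, p_ace=Fraction(1, 2)):
    """Expected dealer payoff per hand under the assumed payoffs."""
    with_ace = -2 * p_call - 1 * (1 - p_call)
    after_bluff = 2 * p_call - 1 * (1 - p_call)
    with_two = 1 * (1 - p_bluff) + p_bluff * after_bluff
    return p_ace * with_ace + (1 - p_ace) * with_two


p, q = Fraction(1, 3), Fraction(2, 3)
# Against bluff probability p, the dealer gains nothing by switching
# between always folding and always calling.
assert dealer_value(p, 0) == dealer_value(p, 1)
# Against call probability q, the player gains nothing by switching
# between never bluffing and always bluffing.
assert dealer_value(0, q) == dealer_value(1, q)
print(dealer_value(p, q))  # prints -1/3
```

Under these assumptions neither side can improve by deviating alone, which is the "only equilibrium" picture the slide contrasts with an adaptive approach.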

Page 27

3-Player Poker: All Possible Deals

[Figure: deals to Player 1, Player 2, and Player 3]

Course of Play

[Figure: game tree over P1, P2, P3. Moves: Pass or Bet $a; responses: F or C.]

C: Call (add $b). F: Fold.

Page 28

Learning Poker

Recover Dealer’s & Player’s Strategies

Play New Game

Separate by Payoffs

Programmable Selection of Recovered Dealer Strategies

Programmable Selection of Recovered Player Strategies

Dealer’s Adaptation: Amplify, Crossover, Mutate

New Dealer Strategies

Player’s Adaptation: Amplify, Crossover, Mutate

Deals

Assemble

New Player Strategies
