Approximating Game-Theoretic Optimal Strategies for Full Scale Poker
(Darse Billings++ 2003 …)
Presented by Brett Borghetti
21 Jan 2007
21 Feb 2007 Brett Borghetti 2
Contributions of the work:
• Reduced the 2-player Hold’em gamespace from O(10^18) to O(10^7) using approximations
• Built a new pokerbot capable of competing with world-class human opponents
• [Brett says] Developed a solution for mixed strategy equilibrium in a ‘model’ (approximation) of full hold’em poker
Interesting Experiments
• Played fairly well against a world-class player (Gautam Rao)
• Although ‘thecount’ won the match, statistically the outcome of this match does not indicate which player is better overall
Approach for reducing the gamespace
• Betting Round Reduction (actions per round)
• Elimination of Betting Rounds (rounds per hand)
• Splitting the hand into multiple abstract subgames
• Bucketing of (approximate) equivalence classes of cards
Betting Round Reduction
• Normally, up to 4 legitimate raises are allowed in 2 player Hold’em
• Reduction allowed only 3 legitimate raises to be considered
• Reduces branching factor from 9 to 7
• Experiments showed that this reduction did not significantly reduce EV or perturb strategy
• Reducing to 2 legitimate raises did perturb EV and strategy significantly
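As a rough check on the branching-factor figures above, the sketch below enumerates the betting sequences in one round that end with both players still in the hand. It assumes that the cap counts the opening bet (so a cap of 4 means a bet plus three raises) and that "branching factor" refers to the number of such continuing sequences; both are interpretations, not statements from the slides. The letters k, b, c, r (check, bet, call, raise) and the function name are illustrative.

```python
def continuing_sequences(max_bets):
    """List betting sequences for one 2-player round that end without a fold."""
    results = []

    def facing_bet(history, bets):
        # The player facing a bet may call (round ends, hand continues) ...
        results.append(history + "c")
        # ... or raise, if the cap has not been reached. Folds end the hand
        # and are therefore not counted here.
        if bets < max_bets:
            facing_bet(history + "r", bets + 1)

    # First player checks: second player either checks behind or bets.
    results.append("kk")
    facing_bet("kb", 1)
    # First player bets.
    facing_bet("b", 1)
    return results


for cap in (4, 3, 2):
    seqs = continuing_sequences(cap)
    print(cap, len(seqs), seqs)
```

Under these assumptions the counts are 9, 7, and 5 for caps of 4, 3, and 2.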
Elimination of Betting Rounds
• Explored truncation (treating Hold’em as an n-round game instead of a 4-round game)
– Used EV rollouts for the remaining rounds (assumed all players checked or called in the truncated rounds)
– Explored truncating early rounds and later rounds
• Combined several truncations
– PsOpti1 uses a 1-round pre-flop model plus a post-flop model
– PsOpti2 uses 2 overlapping 3-round models (pre-flop through turn and flop through river)
Integration of Truncated Models
Bucketing
• Trying to reduce cardspace via equivalence classes with respect to how to bet and how much the cards are worth (EV)
• Built a 2-d graph (Hand Strength vs Hand Potential)
• Choose N ‘buckets’ (the number of clusters to break up the neighborhoods in the graph)
• Explored performance for different values of N and chose
– N-1 buckets of varying hand strength, low potential cards
– 1 bucket for low hand strength, high potential cards
• Used transition probabilities to give likelihoods of transitioning between one bucket and another after revealing the next card(s) on the board
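The bucketing and transition steps can be sketched roughly as follows. The slides do not give the clustering rule, so the cutoffs, the default of 6 buckets, and both function names are hypothetical; the toy data is made up purely to show the shape of a transition table.

```python
from collections import Counter, defaultdict


def assign_bucket(hand_strength, hand_potential, n_buckets=6,
                  potential_cutoff=0.3, strength_cutoff=0.5):
    """Map a (strength, potential) point to a bucket index in [0, n_buckets)."""
    # Hypothetical rule: weak hands with high potential share the last bucket.
    if hand_strength < strength_cutoff and hand_potential >= potential_cutoff:
        return n_buckets - 1
    # All other hands are split evenly by strength into n_buckets - 1 buckets.
    index = int(hand_strength * (n_buckets - 1))
    return min(index, n_buckets - 2)


def transition_probabilities(bucket_pairs):
    """Estimate P(next bucket | current bucket) from (before, after) pairs."""
    counts = defaultdict(Counter)
    for before, after in bucket_pairs:
        counts[before][after] += 1
    return {
        before: {after: c / sum(row.values()) for after, c in row.items()}
        for before, row in counts.items()
    }


# Toy data: a hand's bucket on the flop and its bucket after the turn card.
pairs = [(5, 4), (5, 0), (5, 0), (5, 4), (2, 2), (2, 3), (2, 2), (2, 2)]
print(assign_bucket(0.95, 0.05))   # strong hand, low potential
print(assign_bucket(0.20, 0.45))   # weak hand, high potential
print(transition_probabilities(pairs))
```

In a real abstraction the (before, after) pairs would come from enumerating or sampling the next board card(s), not from a hand-written list.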
Pseudo-Optimal Play
• With the approximated game tree, they used a powerful LP solver (CPLEX with the barrier method and 2 GB of RAM) to solve the linear program for equilibrium play
– Calculation took ‘less than a day’ of computing
• Produced a large lookup table of probability triples for each bucket in each possible condition <P(fold),P(call),P(raise)> which sum to 1
• Play a mixed strategy by randomly choosing one action according to the distribution.
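Sampling from a probability triple might look like the sketch below. The table keys (bucket, betting history), the example probabilities, and the function name are assumptions made for illustration; the slides only say that each entry is a <P(fold),P(call),P(raise)> triple summing to 1.

```python
import random

ACTIONS = ("fold", "call", "raise")


def choose_action(triple, rng=random):
    """Sample one action from a (P(fold), P(call), P(raise)) triple."""
    if abs(sum(triple) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return rng.choices(ACTIONS, weights=triple, k=1)[0]


# Hypothetical lookup table keyed by (bucket, betting history so far).
strategy = {
    (4, "b"): (0.0, 0.3, 0.7),
    (0, "b"): (0.8, 0.1, 0.1),
}

rng = random.Random(0)  # seeded only to make the demonstration repeatable
sample = [choose_action(strategy[(4, "b")], rng) for _ in range(1000)]
print({action: sample.count(action) for action in ACTIONS})
```

Over many hands the observed action frequencies approach the stored triple, which is what makes the play hard to predict from any single decision.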
Issues [Brett]
• Only works with 2 players (the future work section says the authors will also develop an N-player version)
• Does not contain an explicit opponent model that attempts to exploit its current opponent
Background Information
Texas Hold’em Heads-up Limit Poker Basics
• 2 Players
• 4 Betting Rounds per hand
– Preflop (2 hole cards), Flop (3 community cards), Turn (1 cc), River (1 cc)
• Action set = {fold, call(check), raise(bet)}
• Up to 3 raises allowed per round
• Round is over when either
– All players are even in the pot via a final call and each player has had at least one opportunity to act [go on to next round]
– One player folds [other player wins]
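The end-of-round rule can be expressed as a small check over the actions taken so far. This is a sketch for the 2-player case only; the one-letter action codes, the function name, and the choice to count the opening bet toward the raise cap are assumptions.

```python
def round_state(actions, max_raises=3):
    """Classify a 2-player betting round.

    `actions` is a string of 'f' (fold), 'c' (call or check), 'r' (raise or bet).
    """
    raises = 0
    for i, action in enumerate(actions):
        if action == "f":
            return "hand over: other player wins"
        if action == "r":
            raises += 1
            if raises > max_raises:
                raise ValueError("raise cap exceeded")
        elif action == "c" and i >= 1:
            # A call/check once both players have acted leaves the pot even.
            return "round over: go to next round"
        elif action != "c":
            raise ValueError(f"unknown action {action!r}")
    return "round in progress"


for history in ("c", "cc", "cr", "crc", "rf"):
    print(history, "->", round_state(history))
```

An opening check ("c") does not end the round because the second player has not yet had a chance to act.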
Requirements for a World Class Poker Player
• Able to assess
– Hand Strength
– Hand Potential
– Opponent’s Betting Strategy (opponent model)
• Has a strong
– Betting strategy
– Ability to play deceptively [bluff vs. slow play*]
– Ability to play unpredictably
Optimal vs Maximal play
• Optimal player makes decisions based on game-theoretic probabilities without regard to specific context (opponent’s plays)
• Maximal player takes into account the opponent’s sub-optimal tendencies and adjusts its play to exploit perceived weaknesses
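The distinction can be illustrated with rock-paper-scissors as a toy stand-in for poker (not a game from the slides). The equilibrium strategy earns about zero against anyone, while a best response to a biased opponent earns more but depends on knowing the bias. The opponent's frequencies below are invented for the example.

```python
# Payoff to the row player: win = 1, tie = 0, loss = -1.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def expected_value(ours, theirs):
    """Expected payoff of mixed strategy `ours` against mixed strategy `theirs`."""
    return sum(ours[i] * theirs[j] * PAYOFF[i][j]
               for i in range(3) for j in range(3))


def best_response(theirs):
    """Pure strategy with the highest expected payoff against `theirs`."""
    pure = [[1.0 if k == i else 0.0 for k in range(3)] for i in range(3)]
    return max(pure, key=lambda strategy: expected_value(strategy, theirs))


equilibrium = [1 / 3, 1 / 3, 1 / 3]      # the "optimal" player
biased_opponent = [0.5, 0.25, 0.25]      # plays rock too often

print(expected_value(equilibrium, biased_opponent))      # approximately 0
maximal = best_response(biased_opponent)                 # the "maximal" player
print(maximal, expected_value(maximal, biased_opponent))
```

Here the best response is to always play paper, for an expected gain of 0.25 per round, but that pure strategy would itself be exploitable if the opponent adapted.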
Hand Assessment (Hand Strength = HS)
• Pre-Flop HS is determined from the “income rate” of each of the 169 equivalence classes of hole cards, estimated from 1M simulated poker hands
• Flop HS is determined by comparing each of the 1081 possible opponent hands with ours and counting how many wins each player has
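The two counts above can be reproduced directly. The `hand_strength` helper at the end is a sketch that counts ties as half a win, which is an assumed convention; the slides only say that wins are counted for each player.

```python
from itertools import combinations
from math import comb

RANKS = "23456789TJQKA"

# Pre-flop: 13 pairs, plus a suited and an offsuit class for every
# unordered pair of distinct ranks.
pairs = [r + r for r in RANKS]
unpaired = list(combinations(RANKS, 2))
suited = [a + b + "s" for a, b in unpaired]
offsuit = [a + b + "o" for a, b in unpaired]
print(len(pairs) + len(suited) + len(offsuit))   # 169

# Flop: 52 cards minus our 2 hole cards and 3 board cards leaves 47 unseen,
# and the opponent holds some 2 of them.
print(comb(52 - 2 - 3, 2))                       # 1081


def hand_strength(ahead, tied, behind):
    """Share of opponent hands we beat, with ties counted as half (assumed)."""
    return (ahead + tied / 2) / (ahead + tied + behind)
```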
Hand Potential (HP) at the Flop
• PPot1 = likelihood that our hand will improve with one card (the turn card)
• PPot2 = likelihood that our hand will improve with two cards (turn and river)
• NPot1 and NPot2 = equivalent calculations of the likelihood that our opponent’s hand will get better than ours on the turn and/or river
Effective Hand Strength & Pot Odds
• EHS = HS_n + (1 - HS_n) × PPot_n
– The chance that we either are ahead or could pull ahead by the end of n=1 or n=2 cards from now
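A direct transcription of the EHS formula, as a minimal sketch. The input values are hypothetical and chosen only to show how a weak hand with good potential ends up with a moderate EHS.

```python
def effective_hand_strength(hs, ppot):
    """EHS = HS + (1 - HS) * PPot, with both inputs in [0, 1]."""
    if not (0.0 <= hs <= 1.0 and 0.0 <= ppot <= 1.0):
        raise ValueError("hs and ppot must be probabilities")
    return hs + (1.0 - hs) * ppot


# Hypothetical drawing hand: behind most hands now, but improves often.
print(effective_hand_strength(0.30, 0.40))   # about 0.58
```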
• Pot odds = P(win)/(Expected Return on Pot)
– Example: if your chance of winning is 25%, you would call a $4 bet to win a $16 pot because your expected return is 0.25 × $20 = $5. You can expect to win $5 every time you pay $4, for an expected net gain of $1.00 per play.
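The arithmetic in the example can be checked with a short calculation. It follows the slide's accounting, in which the $16 pot excludes our $4 call and a win returns the full $20; the function names are illustrative.

```python
def call_ev(p_win, pot, bet):
    """Expected net gain of calling `bet` to win `pot` (pot excludes our call)."""
    return p_win * (pot + bet) - bet


def break_even_probability(pot, bet):
    """Win probability at which calling has zero expected net gain."""
    return bet / (pot + bet)


print(call_ev(0.25, 16, 4))            # 1.0
print(break_even_probability(16, 4))   # 0.2
```

With these numbers, calling is profitable whenever the chance of winning exceeds 20%, which is why a 25% chance yields a positive expected gain.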