How computers play games with you CS161, Spring ‘03 Nathan Sturtevant

Post on 21-Dec-2015


Page 1:

How computers play games with you

CS161, Spring ‘03

Nathan Sturtevant

Page 2:

Outline

Historic Examples
Classes of Games
Algorithms
  Minimax
  α-β pruning
  Other techniques
Multi-Player Games

Page 3:

Successful Game Programs

Checkers: Chinook
  1992: Tinsley won 40-game match, 4-2-34
  1994: Tinsley withdrew due to health reasons
  444 billion position end-game database
Chess
  Kasparov is currently the best human
  1997: Deep Blue won exhibition match, 2-1-3
  2003: Deep Junior played to a draw

Page 4:

Game Programs (continued)

Othello (Reversi)
  1997: Logistello beat Murakami 6-0 (264/120)
Scrabble: Maven
  1998: played Adam Logan, won 9-5
  Came back from 98 points down to win with MOUTHPART
Awari (Mancala)
  Solved in 2002 - draw
  http://awari.cs.vu.nl/

Page 5:

Overview - Types of Games

Single-Agent Search
  1 player v. a difficult problem
  Defined by:
    Start state
    Successor function
    Heuristic function
    Goal test

Page 6:

Overview - Types of Games

Game Search (Adversary Search)
  Defined by:
    Initial state
    Successor function
    Terminal test
    Utility / payoff function
      Similar to heuristics in single-agent problems

Page 7:

Chinese Checkers

Based on the European game Halma
Americans called it Chinese Checkers
1-player game? 2-player game? Multi-player game?

Page 8:

Classes of Games

Deterministic v. Non-deterministic
  Chess v. Backgammon
Perfect information v. Imperfect information
  Checkers v. Bridge
Zero-sum (strictly competitive) v. Non-zero sum
  Prisoner's dilemma is non-zero sum

Page 13:

Classes of Games

                       Deterministic                                    Chance
Perfect information    Chess, checkers, go, othello, chinese checkers   Backgammon, monopoly, risk
Imperfect information  Stratego                                         Bridge, poker, scrabble, (real life)

Page 14:

How do we simulate games?

Build a game tree
  Start state at the root
  All possible moves as children
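A minimal sketch of how the first levels of such a tree could be generated for tic-tac-toe. The board encoding (a 9-tuple in row-major order) and the name `successors` are assumptions made for this illustration, not something the slides specify, and the sketch does not stop at finished games.

```python
from typing import List, Tuple

Board = Tuple[str, ...]  # 9 cells, row-major; each is "x", "o", or " "

def successors(board: Board, player: str) -> List[Board]:
    """Return every board reachable by `player` placing one mark."""
    children = []
    for i, cell in enumerate(board):
        if cell == " ":
            children.append(board[:i] + (player,) + board[i + 1:])
    return children

start: Board = (" ",) * 9                 # start state at the root
level1 = successors(start, "x")           # all possible first moves
level2 = [s for b in level1 for s in successors(b, "o")]
print(len(level1), len(level2))           # 9 72
```

Each level is produced by applying the successor function to every node of the level above, which is all that "all possible moves as children" requires.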

Page 15:

Tic-Tac-Toe

[Figure: tic-tac-toe game tree; levels alternate between Me and Opponent.]

Page 16:

How do we choose our move?

Apply utility function at the leaves of the tree
  In tic-tac-toe, count how many rows and columns are occupied by each player and subtract
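One possible reading of this utility function, as a hedged sketch: a line counts as "occupied" by a player if it holds at least one of that player's marks, and diagonals are counted alongside rows and columns. Both choices are assumptions, as are the names `LINES`, `occupied_lines`, and `utility`; won positions (which would score ∞) are not handled.

```python
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
    (0, 4, 8), (2, 4, 6),              # diagonals
]

def occupied_lines(board, player):
    """Count lines that contain at least one of `player`'s marks."""
    return sum(1 for line in LINES if any(board[i] == player for i in line))

def utility(board, me="x", opponent="o"):
    """Lines occupied by `me` minus lines occupied by `opponent`."""
    return occupied_lines(board, me) - occupied_lines(board, opponent)

board = ("x", " ", " ",
         " ", "x", " ",
         " ", " ", "o")
print(occupied_lines(board, "x"))  # 6  (2 rows, 2 columns, 2 diagonals)
print(occupied_lines(board, "o"))  # 3  (1 row, 1 column, 1 diagonal)
print(utility(board))              # 3
```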

Page 17:

Tic-Tac-Toe (levels: Me, Opponent)

[Figure: game tree with four leaf boards; r = rows, c = columns, d = diagonals]

Leaf 1: x: 2r 2c 2d; o: 2r 1c 1d; Utility = 2
Leaf 2: x: 2r 3c 2d; o: 2r 1c 1d; Utility = 3
Leaf 3: x: 3r 3c 2d; o: 2r 2c 0d; Utility = ∞
Leaf 4: x: 2r 2c 2d; o: 2r 2c 0d; Utility = 2

Backed-up values: Utility = 3, Utility = ∞
Root: Utility = 3

Page 18:

What is our algorithm?

Apply utility function at the leaves of the tree
  In tic-tac-toe, count how many rows and columns are occupied by each player and subtract
Back-up values in the tree
  This calculates the “minimax” value of a tree
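The back-up step can be sketched in a few lines. This is a minimal illustration, assuming the tree is given as nested Python lists whose leaves are already utility values; the representation and the example tree are choices made here, not taken from the slides.

```python
import math

def minimax(node, maximizing):
    """Back up leaf utilities; leaves are numbers, internal nodes are lists."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

tree = [[2, 3], [math.inf, 2]]
print(minimax(tree, maximizing=True))    # max(min(2, 3), min(inf, 2)) = 2
print(minimax(tree, maximizing=False))   # min(max(2, 3), max(inf, 2)) = 3
```

The same leaves give different values depending on who moves at the root, which is why the player to move is part of the search state.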

Page 19:

Minimax

[Figure: two-ply minimax tree with a Maximizer level and a Minimizer level, 1 ply each; leaf values 2, 3, ∞, 2; backed-up values 3 and 3; the Minimizer's strategy is marked.]

Page 20:

Minimax - Properties

Complete? Yes - if tree is finite
Optimal? Yes - against an optimal opponent
Time Complexity? O(b^d)
Space Complexity? O(b·d)

(b is the branching factor, d is the search depth)

Page 21:

Minimax

Assume our computer can expand 10^5 nodes/sec
Assume we have 100 seconds to move
  10^7 nodes/move
Tic-tac-toe
  9! = 362880 (naïve) ways to play a game (b=4)
  3^9 = 19683 possible states (upper bound) on a board
Chess
  b = 35, d = 100, must search 35^100, about 10^154 nodes
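The figures on this slide can be checked with a few lines of integer arithmetic. This is only a sanity check of the numbers above; the variable names are invented for the example.

```python
import math

nodes_per_move = 10**5 * 100                  # 10^5 nodes/sec for 100 seconds
ttt_games = math.factorial(9)                 # naive orderings of the 9 squares
ttt_states = 3**9                             # each square is empty, x, or o
chess_nodes = 35**100                         # b^d with b = 35, d = 100
chess_exponent = len(str(chess_nodes)) - 1    # order of magnitude

print(nodes_per_move)    # 10000000
print(ttt_games)         # 362880
print(ttt_states)        # 19683
print(chess_exponent)    # 154
```

Tic-tac-toe fits comfortably inside a budget of 10^7 nodes per move, while a full chess search exceeds it by roughly 147 orders of magnitude.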

Page 22:

Minimax - issues

Evaluation function
  Where does it come from?
    Expert knowledge
      Chess: material value
      Othello (reversi): positional strength
    Learned information
    Pre-computed tables
Quiescence

Page 23:

quiescence

Page 24:

Minimax - issues

Quiescence
  We don’t see the consequences of our bad choices
  Quiescence search
Horizon problem
  We avoid dealing with a bad situation

Page 25:

Minimax

In Chess
  b = 35
  10^7 nodes/move
  Can search 4-ply into tree (human novice)
  Good humans can search 8-ply
  Kasparov searches about 12-ply
What to do? α-β pruning

Page 26:

Minimax

[Figure: the two-ply minimax tree from Page 19, shown again; leaf values 2, 3, ∞, 2; backed-up values 3 and 3; the Minimizer's strategy is marked.]

Page 27:

α-β pruning

α = lower bound on Maximizer’s score
  Start at -∞
β = upper bound on Minimizer’s score
  Start at ∞

Page 28:

[Figure: α-β search, step 1. Tree with Maximizer and Minimizer levels; every node on the current path has α = -∞, β = ∞. The first leaf is 1, so its parent is ≥ 1.]

Page 29:

[Figure: α-β search, step 2. Leaves 1 and 2 have been evaluated and their parent gets value 2. That parent was at α = 1, β = ∞ after the first leaf; the nodes above it still have α = -∞, β = ∞.]

Page 30:

[Figure: α-β search, step 3. The parent of leaves 1 and 2 now has α = 2, β = ∞ and value 2; the nodes above it still have α = -∞, β = ∞.]

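Page 31:

[Figure: α-β search, step 4. Leaves 1, 2, 3. The Minimizer node above the first subtree is ≤ 2, so it has α = -∞, β = 2. Its next child is searched with the same bounds; that child's first leaf is 3, so the child is ≥ 3, which exceeds β = 2. The root still has α = -∞, β = ∞.]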

Page 32:

[Figure: α-β search, step 5. The child that reached ≥ 3 (α = 3, β = 2) is cut off. The Minimizer node gets value 2, so the root is ≥ 2.]

Page 33:

[Figure: α-β search, step 6. The root now has α = 2, β = ∞, and these bounds are passed down into the second subtree. The first leaf there is 5, so its parent is ≥ 5.]

Page 34:

[Figure: α-β search, step 7. Leaves 5 and 6. After leaf 5 the parent has α = 5, β = ∞; after leaf 6 its value is 6. The nodes above it have α = 2, β = ∞.]

Page 35:

[Figure: α-β search, step 8. Leaves 5, 6, 7. The second Minimizer node is ≤ 6, so it has α = 2, β = 6. Its next child is searched with the same bounds; that child's first leaf is 7, so the child is ≥ 7, which exceeds β = 6.]

Page 36:

[Figure: α-β search, step 9. The child that reached ≥ 7 (α = 7, β = 6) is cut off. The second Minimizer node gets value 6, and the root's value is 6. Leaves evaluated: 1, 2, 3, 5, 6, 7.]

Page 37:

α-β pruning

Complete? Yes - if tree is finite
Optimal? Computes same value as minimax
Time Complexity?
  Best case O(b^(d/2))
  Average case O(b^(3d/4))
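A compact sketch of α-β search next to plain minimax, to illustrate that both return the same value while α-β looks at fewer leaves. The nested-list tree is an assumption of this example: it uses leaf values 1, 2, 3, 5, 6, 7 in a three-level tree with a Maximizer at the root, and the two leaves that end up pruned (9 and 0) are made-up placeholders.

```python
import math

def minimax(node, maximizing):
    if not isinstance(node, list):
        return node
    values = [minimax(c, not maximizing) for c in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, alpha, beta, maximizing, visited):
    """Minimax value of `node`, skipping children that cannot matter."""
    if not isinstance(node, list):
        visited.append(node)          # record each leaf actually evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)  # raise the Maximizer's lower bound
            if alpha >= beta:          # Minimizer above will never allow this
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)        # lower the Minimizer's upper bound
        if alpha >= beta:              # Maximizer above already has better
            break
    return value

# Root is a Maximizer, its children are Minimizers, then Maximizers, then leaves.
tree = [[[1, 2], [3, 9]], [[5, 6], [7, 0]]]
visited = []
print(alphabeta(tree, -math.inf, math.inf, True, visited))  # 6
print(minimax(tree, True))                                  # 6
print(visited)                                              # [1, 2, 3, 5, 6, 7]
```

Six of the eight leaves are evaluated; the placeholders 9 and 0 are never visited, so their values cannot affect the result.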

Page 38:

α-β pruning

Effectiveness depends on order of moves in tree
In practice, we can usually get close to best-case performance
Chess
  Before we could search 4-ply into tree
  Now we can search 8-ply into tree
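The jump from 4-ply to 8-ply can be reproduced with a rough calculation. The sketch below counts leaves only and assumes a uniform tree with b = 35 and a budget of 10^7; for the α-β best case it uses the leaf count b^⌈d/2⌉ + b^⌊d/2⌋ - 1 from Knuth and Moore's analysis of perfectly ordered trees. The function names are invented for the example.

```python
def minimax_leaves(b, d):
    """Leaves examined by full minimax on a uniform tree."""
    return b ** d

def alphabeta_best_case_leaves(b, d):
    """Leaves examined by alpha-beta with perfect move ordering."""
    return b ** ((d + 1) // 2) + b ** (d // 2) - 1

def max_depth(leaf_count, b, budget):
    """Deepest search whose leaf count still fits in `budget`."""
    d = 0
    while leaf_count(b, d + 1) <= budget:
        d += 1
    return d

budget = 10**7
print(max_depth(minimax_leaves, 35, budget))               # 4
print(max_depth(alphabeta_best_case_leaves, 35, budget))   # 8
```

With good move ordering the same node budget reaches about twice the depth, which is the O(b^(d/2)) best case in concrete numbers.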

Page 39:

Other Techniques

Transposition tables
Opening / Closing book

Page 40:

Transposition Tables

Only using a linear amount of memory
  Search only takes a few KB of memory
Most games aren’t trees but graphs

Page 41:

Transposition Tables

Page 42:

Transposition Tables

A lot of duplicated effort
Transposition tables hash game states into table
  Store saved minimax value in table
Pre-compute & store values
  Opening book
  Closing book
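A small sketch of the idea, using a Python dict as the hash table. The game is a stand-in chosen for brevity and is not from the slides: players alternately take 1, 2, or 3 stones, and whoever takes the last stone wins. Different move orders (take 1 then 2, or 2 then 1) reach the same state, so the search space is a graph rather than a tree.

```python
def solve(stones, maximizing, table, stats):
    """Minimax value of the state: +1 if the Maximizer wins, -1 otherwise."""
    key = (stones, maximizing)
    if key in table:                  # state already searched via another path
        stats["hits"] += 1
        return table[key]
    stats["expanded"] += 1
    if stones == 0:
        # The previous player took the last stone and won.
        value = -1 if maximizing else 1
    else:
        values = [solve(stones - k, not maximizing, table, stats)
                  for k in (1, 2, 3) if k <= stones]
        value = max(values) if maximizing else min(values)
    table[key] = value                # store the saved minimax value
    return value

stats = {"expanded": 0, "hits": 0}
print(solve(20, True, {}, stats))     # -1: with 20 stones the player to move loses
print(stats["expanded"] <= 2 * 21)    # True: at most one expansion per distinct state
```

Without the table the same search would re-expand every path separately; with it, the work is bounded by the number of distinct states.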

Page 43:

Multi-Player Games

2-Player game trees have a single minimax value
Games with ≥ 2 players use an n-tuple of scores
  e.g. (3, 2, 5)
The sum of values in every tuple should be constant

Page 44:

Maxn

(3 Players)

[Figure: maxn tree; each node is labeled with the player to move and its score tuple]

Player 1 (root): (7, 3, 0)
  Player 2: (7, 3, 0)
    Player 3: (7, 3, 0)
    Player 3: (3, 2, 5)
  Player 2: (0, 10, 0)
    Player 3: (0, 10, 0)
    Player 3: (4, 2, 4)
  Player 2: (1, 4, 5)
    Player 3: (1, 4, 5)
    Player 3: (4, 3, 3)
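The back-up rule in the three-player tree above can be sketched as follows: at each node, the player to move picks the child whose tuple gives that player the highest score. The nested-list representation, the 0-based player index, and the treatment of the Player 3 tuples as leaves are assumptions of this sketch.

```python
def maxn(node, player, num_players=3):
    """Return the maxn value (a score tuple) of `node`; `player` is 0-based."""
    if isinstance(node, tuple):               # leaf: one score per player
        return node
    best = None
    for child in node:
        value = maxn(child, (player + 1) % num_players, num_players)
        # Keep the child that is best for the player moving at this node.
        if best is None or value[player] > best[player]:
            best = value
    return best

tree = [
    [(7, 3, 0), (3, 2, 5)],
    [(0, 10, 0), (4, 2, 4)],
    [(1, 4, 5), (4, 3, 3)],
]
print([maxn(sub, player=1) for sub in tree])  # [(7, 3, 0), (0, 10, 0), (1, 4, 5)]
print(maxn(tree, player=0))                   # (7, 3, 0)
```

Ties are broken here in favor of the first child found; other tie-breaking rules are possible and can change the result in general.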

Page 45:

Can we prune maxn trees?

In minimax we bound the game tree value
In maxn we bound based on sum of values
  All scores sum to 10
  If Player 1 gets 7 points…
  Players 2 and 3 will each get ≤ 3 points

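Page 46:

Shallow Maxn Pruning

(3 Players)

[Figure: the maxn tree with shallow pruning applied]

Player 1 (root): (7, 3, 0); after the first child the bounds are (≥7, ≤3, ≤3)
  Player 2: (7, 3, 0)
    Player 3: (7, 3, 0)
    Player 3: (3, 2, 5)
  Player 2: (0, 10, 0)
    Player 3: (0, 10, 0)
    (remaining child pruned)
  Player 2: (≤6, ≥4, ≤6)
    Player 3: (1, 4, 5)
    (remaining child pruned)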
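A hedged sketch of shallow pruning on the same three-player example. It assumes scores are non-negative and always sum to `MAXSUM` = 10; the function name, the `bound` parameter, and the list that records evaluated leaves are inventions of this example. The tree includes the two leaves that the pruning skips, (4, 2, 4) and (4, 3, 3).

```python
MAXSUM = 10   # assumed: every score tuple sums to 10, all scores >= 0

def maxn_shallow(node, player, bound, evaluated, num_players=3):
    """maxn with shallow pruning; stop once `player` reaches `bound`."""
    if isinstance(node, tuple):
        evaluated.append(node)
        return node
    best = None
    for child in node:
        # If the next player reaches this score, `player` gets no more
        # than it already has, so the rest of that child can be skipped.
        child_bound = MAXSUM - (best[player] if best is not None else 0)
        value = maxn_shallow(child, (player + 1) % num_players,
                             child_bound, evaluated, num_players)
        if best is None or value[player] > best[player]:
            best = value
        if best[player] >= bound:     # parent cannot do better through here
            break
    return best

tree = [
    [(7, 3, 0), (3, 2, 5)],
    [(0, 10, 0), (4, 2, 4)],
    [(1, 4, 5), (4, 3, 3)],
]
evaluated = []
print(maxn_shallow(tree, 0, MAXSUM, evaluated))  # (7, 3, 0)
print(evaluated)   # [(7, 3, 0), (3, 2, 5), (0, 10, 0), (1, 4, 5)]
```

Once Player 1 is guaranteed 7, any Player 2 node where Player 2 already has 3 or more cannot give Player 1 more than 7, so its remaining children are skipped.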

Page 47:

Shallow Maxn Pruning

Complete? Yes
Optimal? Yes*
Time Complexity?
  Best-case**: b^(d/2)
  Average-case: b^d
Space Complexity? b•d

Page 48:

Maxn Pruning

Why is maxn pruning weak in practice?
  Only compares 2 scores out of n players
  Relies on game evaluation properties, not ordering
Last-Branch Pruning
Speculative Pruning

Page 49:

Last-Branch/Speculative Pruning

(3 Players)

[Figure: three-player tree illustrating last-branch/speculative pruning. Recoverable node labels: Player 1 with (3, 3, 4), Player 2 with (3, 3, 4), Player 3 with (1, 4, 5), Player 1 with (2, 4, 4).]

Page 50:

Last Branch/Spec. Pruning

Best case: O(b^(d·(n-1)/n)) as b gets large
Dependent only on node ordering in tree
http://www.cs.ucla.edu/~nathanst/ for more info

Page 51:

Imperfect Information

Most card games have imperfect information
We can use Monte Carlo simulation
  Create many consistent samples of possible opponent hands
  Solve using perfect-information methods
  Combine results together to make next move
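The three steps can be sketched on a deliberately tiny, hypothetical card game: we play one card, the opponent answers from a hidden hand, and the higher card wins. The game, the function `choose_card`, and all parameters are assumptions made for illustration; a real card game would replace the one-line "solve" with a full perfect-information search.

```python
import random

def choose_card(my_hand, unseen, opp_hand_size, samples, rng):
    """Pick a card by sampling opponent hands and scoring each option."""
    wins = {card: 0 for card in my_hand}
    for _ in range(samples):
        # Step 1: one sample consistent with what we know (the unseen cards).
        opp_hand = rng.sample(unseen, opp_hand_size)
        for card in my_hand:
            # Step 2: solve with the hand visible. The opponent answers
            # with its best card, so we win only if ours is higher.
            if card > max(opp_hand):
                wins[card] += 1
    # Step 3: combine results; ties go to the higher card.
    best = max(my_hand, key=lambda c: (wins[c], c))
    return best, wins

rng = random.Random(0)
best, wins = choose_card([3, 9, 5], [1, 2, 4, 6, 7, 8, 10], 2, 200, rng)
print(best)   # 9
```

The exact win counts depend on the random samples, but the 9 wins in every sample where the 5 or the 3 would win, so it is always the card chosen here.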