game playing - university of rochesterjliu/csc-242/games.pdf · slide 4 two-player zero-sum...

slide 1 Game Playing Lecturer: Ji Liu Thank Jerry Zhu for his slides [based on slides from A. Moore http://www.cs.cmu.edu/~awm/tutorials , C. Dyer, and J. Skrentny]


Post on 13-Jun-2020


Page 1: Game Playing - University of Rochesterjliu/CSC-242/games.pdf · slide 4 Two-player zero-sum discrete finite deterministic games of perfect information Definitions: •Zero-sum: one

slide 1

Game Playing

Lecturer: Ji Liu

Thanks to Jerry Zhu for his slides

[based on slides from A. Moore http://www.cs.cmu.edu/~awm/tutorials , C. Dyer, and J. Skrentny]


slide 2

Sadly, not these games…

I am here


slide 3

Overview

• two-player zero-sum discrete finite deterministic game of perfect information

• Minimax search

• Alpha-beta pruning

• Large games

• two-player zero-sum discrete finite NON-deterministic game of perfect information


slide 4

Two-player zero-sum discrete finite deterministic games of perfect information

Definitions:

• Zero-sum: one player’s gain is the other player’s loss. Does not mean fair.

• Discrete: states and decisions have discrete values

• Finite: finite number of states and decisions

• Deterministic: no coin flips, die rolls – no chance

• Perfect information: each player can see the complete game state. No simultaneous decisions.


slide 6

Which of these are: Two-player zero-sum discrete finite deterministic games of perfect information?

[Shamelessly copied from Andrew Moore]

Zero-sum: one player’s gain is the other player’s loss. Does not mean fair.

Discrete: states and decisions have discrete values

Finite: finite number of states and decisions

Deterministic: no coin flips, die rolls – no chance

Perfect information: each player can see the complete game state. No simultaneous decisions.

[Figure: pictures of candidate games, each annotated with the reason it does not qualify: not finite; multiplayer; one player; stochastic; hidden information; involves improbable animal behavior.]

slide 7

II-Nim: Max simple game

• There are 2 piles of sticks. Each pile has 2 sticks.

• Each player takes one or more sticks from one pile.

• The player who takes the last stick loses.

(ii, ii)
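The rules above can be sketched in code. This is a minimal illustration, assuming a state is a pair of pile sizes, so (ii, ii) becomes `(2, 2)`; the names `State`, `successors`, and `is_terminal` are hypothetical and not from the slides.

```python
from typing import List, Tuple

State = Tuple[int, int]  # sticks left in each pile, e.g. (2, 2) for (ii, ii)


def successors(state: State) -> List[State]:
    """All states reachable by taking one or more sticks from a single pile."""
    result = set()
    for pile, count in enumerate(state):
        for take in range(1, count + 1):
            nxt = list(state)
            nxt[pile] -= take
            # Sort so that symmetric states such as (1, 2) and (2, 1) coincide.
            result.add(tuple(sorted(nxt)))
    return sorted(result)


def is_terminal(state: State) -> bool:
    """The game is over when no sticks remain."""
    return sum(state) == 0


print(successors((2, 2)))  # [(0, 2), (1, 2)]
```

Sorting each successor merges mirror-image states, so the opening position has only two distinct moves.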


slide 12

The game tree for II-Nim

[Figure: game tree. Each node shows the state, who is to move at that state, and the node's value.]

(ii ii) Max -1
    (i ii) Min -1
        (i i) Max +1
            (- i) Min +1
                (- -) Max +1
        (- ii) Max +1
            (- i) Min +1
                (- -) Max +1
            (- -) Min -1
        (- i) Max -1
            (- -) Min -1
    (- ii) Min -1
        (- i) Max -1
            (- -) Min -1
        (- -) Max +1

Symmetry: (i ii) = (ii i)

Convention: score is w.r.t. the first player Max. Min’s score = – Max

Two players: Max and Min

Max wants the largest score

Min wants the smallest score

Major difference from standard search:

The opponent has control over which

action to take, when it’s his turn.

The first player always loses, if the second player plays optimally


slide 13

Game theoretic value

• Game theoretic value (a.k.a. minimax value) of a node = the score of the terminal node that will be reached if both players play optimally.

• = The numbers we filled in.

• Computed bottom up
   - In Max’s turn, take the max of the children (Max will pick that maximizing action)
   - In Min’s turn, take the min of the children (Min will pick that minimizing action)

• Implemented as a modified version of DFS: minimax algorithm


slide 14

Minimax algorithm

function Max-Value(s)
  inputs: s: current state in game, Max about to play
  output: best-score (for Max) available from s

  if ( s is a terminal state ) then return ( terminal value of s )
  else
    α := – ∞
    for each s’ in Succ(s)
      α := max( α , Min-Value(s’) )
    return α

function Min-Value(s)
  output: best-score (for Min) available from s

  if ( s is a terminal state ) then return ( terminal value of s )
  else
    β := ∞
    for each s’ in Succ(s)
      β := min( β , Max-Value(s’) )
    return β

• Time complexity?

• Space complexity?
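As a rough illustration, the two mutually recursive functions can be run on II-Nim. This is a sketch under assumptions, not the lecture's own code: states are pairs of pile sizes, and the helper `successors` is a hypothetical name introduced here.

```python
import math
from typing import List, Tuple

State = Tuple[int, int]  # sticks left in each pile


def successors(state: State) -> List[State]:
    """States reachable by taking one or more sticks from one pile."""
    result = set()
    for pile, count in enumerate(state):
        for take in range(1, count + 1):
            nxt = list(state)
            nxt[pile] -= take
            result.add(tuple(sorted(nxt)))  # merge symmetric states
    return sorted(result)


def max_value(s: State) -> float:
    """Best score for Max from s, with Max about to play."""
    if sum(s) == 0:
        return +1  # Min took the last stick, so Max wins
    alpha = -math.inf
    for s2 in successors(s):
        alpha = max(alpha, min_value(s2))
    return alpha


def min_value(s: State) -> float:
    """Best score for Min from s, with Min about to play."""
    if sum(s) == 0:
        return -1  # Max took the last stick, so Max loses
    beta = math.inf
    for s2 in successors(s):
        beta = min(beta, max_value(s2))
    return beta


print(max_value((2, 2)))  # -1
```

The result matches the value filled in at the root of the II-Nim tree: the first player loses against optimal play.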


slide 15

Minimax example

[Figure: game tree. Levels alternate max, min, max, min, max from the root; leaves show their values.]

A (max)
    B (min)
        F (max)
            N = 4
            O (min)
                W = -3
                X = -5
        G = -5
    C (min)
        H = 3
        I = 8
        J (max)
            P = 9
            Q = -6
            R = 0
    D = 0
    E (min)
        K (max)
            S = 3
            T = 5
        L = 2
        M (max)
            U = -7
            V = -9

slide 16

Minimax algorithm

Max-Value(s) and Min-Value(s) are the same functions as on slide 14.

• Time complexity? O(b^m) for branching factor b and maximum depth m: bad

• Space complexity? O(bm)

slide 17

Against a dumber opponent?

• Max surely loses!

• If Min is not optimal, which way should Max move? Why?

[Figure: the II-Nim game tree with values, identical to the tree on slide 12. The root (ii ii) has value -1, and both of Max's first moves, (i ii) and (- ii), have value -1.]

slide 18

Next: alpha-beta pruning

Gives the same game theoretic values as minimax, but prunes part of the game tree.

"If you have an idea that is surely bad, don't take the time to see how truly awful it is." -- Pat Winston


slide 19

Alpha-Beta Motivation

[Figure: game tree. Max node S has min children A and B. A has leaves C = 200 and D = 100, so A = 100. B has leaves E = 120 and F = 20, plus an unexplored subtree G.]

• Depth-first order

• After returning from A, Max can get at least 100 at S

• After returning from F, Max can get at most 20 at B

• At this point, Max loses interest in B

• There is no need to explore G. The subtree at G is pruned. Saves time.

slide 20

Alpha-beta pruning

function Max-Value(s, α, β)
  inputs:
    s: current state in game, Max about to play
    α: best score (highest) for Max along path to s
    β: best score (lowest) for Min along path to s
  output: min( β , best-score (for Max) available from s )

  if ( s is a terminal state ) then return ( terminal value of s )
  else
    for each s’ in Succ(s)
      α := max( α , Min-Value(s’, α, β) )
      if ( α ≥ β ) then return β   /* alpha pruning */
    return α

function Min-Value(s, α, β)
  output: max( α , best-score (for Min) available from s )

  if ( s is a terminal state ) then return ( terminal value of s )
  else
    for each s’ in Succ(s)
      β := min( β , Max-Value(s’, α, β) )
      if ( α ≥ β ) then return α   /* beta pruning */
    return β

Starting from the root: Max-Value(root, -∞ , +∞)
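The pseudocode can be tried on the small tree from the motivation slide (S with min children A and B). This is a sketch under assumptions: the dictionaries `CHILDREN` and `VALUE` and the list `visited` are hypothetical names, and G is given a placeholder value of 0 because the slides never state one.

```python
import math

# Tree from the alpha-beta motivation slide.
CHILDREN = {"S": ["A", "B"], "A": ["C", "D"], "B": ["E", "F", "G"]}
# Leaf values; G's value is a placeholder (assumption), it is never evaluated.
VALUE = {"C": 200, "D": 100, "E": 120, "F": 20, "G": 0}
visited = []  # leaves actually evaluated, in order


def max_value(s, alpha, beta):
    if s in VALUE:  # terminal state
        visited.append(s)
        return VALUE[s]
    for child in CHILDREN[s]:
        alpha = max(alpha, min_value(child, alpha, beta))
        if alpha >= beta:
            return beta  # remaining children are pruned
    return alpha


def min_value(s, alpha, beta):
    if s in VALUE:  # terminal state
        visited.append(s)
        return VALUE[s]
    for child in CHILDREN[s]:
        beta = min(beta, max_value(child, alpha, beta))
        if alpha >= beta:
            return alpha  # remaining children are pruned
    return beta


root = max_value("S", -math.inf, math.inf)
print(root, visited)  # 100 ['C', 'D', 'E', 'F']
```

G never appears in `visited`: after F, the bounds at B are α = 100 and β = 20, so the loop returns before reaching it.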


slide 21

Alpha pruning example

[Figure: the tree from slide 19. Max node S has min children A (leaves C = 200, D = 100) and B (leaves E = 120, F = 20, and subtree G). Search starts at S with α = -∞, β = +∞.]

• Keep two bounds along the path
   - α: the best Max can do
   - β: the best (smallest) Min can do

• If at any time α ≥ β, the remaining children are pruned.

slide 22

Alpha pruning example

[Figure: the tree from slide 21. Descend to A (min) with α = -∞, β = +∞. S still has α = -∞, β = +∞.]

slide 23

Alpha pruning example

[Figure: the tree from slide 21. After leaf C = 200, A has α = -∞, β = 200. S still has α = -∞, β = +∞.]

slide 24

Alpha pruning example

[Figure: the tree from slide 21. After leaf D = 100, A has α = -∞, β = 100. S still has α = -∞, β = +∞.]

slide 25

Alpha pruning example

[Figure: the tree from slide 21. A returns 100, so S has α = 100, β = +∞. Descend to B (min) with α = 100, β = +∞.]

slide 26

Alpha pruning example

[Figure: the tree from slide 21. After leaf E = 120, B has α = 100, β = 120.]

slide 27

Alpha pruning example

[Figure: the tree from slide 21. After leaf F = 20, B has α = 100, β = 20. Since α ≥ β, the subtree at G is pruned (marked X).]

slide 28

Beta pruning example

[Figure: partial game tree with levels max, min, max. Min node A has max children B and C. B has leaves D = 20 and E = -10, so B = 20 and β = 20 at A. C has leaves F = -20, G = 25, and H; C reaches 25 after G, which is at least β = 20, so H is pruned (marked X). The figure also labels a node S, apparently the max node above A.]

• Keep two bounds along the path
   - α: the best Max can do
   - β: the best (smallest) Min can do

• If at any time α ≥ β, the remaining children are pruned.

slide 29

Yet another alpha-beta pruning example

• Keep two bounds along the path
   - α: the best Max can do on the path
   - β: the best (smallest) Min can do on the path

• If at any time α ≥ β, the remaining children are pruned.

[Figure: the game tree from the minimax example on slide 15. Search starts at the root A (max) with α = -∞. In these figures "–" abbreviates –∞ and "+" abbreviates +∞.]

[Example from James Skrentny]

slide 30

Alpha-beta pruning example

• Keep two bounds along the path
   - α: the best Max can do on the path
   - β: the best (smallest) Min can do on the path

• If a max node exceeds β, it is pruned.

• If a min node goes below α, it is pruned.

[Figure: the game tree from slide 15. Descend to B (min) with β = +∞. A has α = -∞.]

[Example from James Skrentny]

slide 31

Alpha-beta pruning example

[Figure: the game tree from slide 15. Descend to F (max) with α = -∞. B has β = +∞; A has α = -∞.]

[Example from James Skrentny]

slide 32

Alpha-beta pruning example

[Figure: the game tree from slide 15. Visit leaf N = 4 under F. F has α = -∞; B has β = +∞; A has α = -∞.]

[Example from James Skrentny]

slide 33

Alpha-beta pruning example

[Figure: the game tree from slide 15. N returns 4, so α at F is updated from -∞ to 4.]

[Example from James Skrentny]

slide 34

Alpha-beta pruning example

[Figure: the game tree from slide 15. Descend to O (min) with β = +∞. F has α = 4; B has β = +∞; A has α = -∞.]

[Example from James Skrentny]

slide 35

Alpha-beta pruning example

[Figure: the game tree from slide 15. Visit leaf W = -3 under O. O has β = +∞; F has α = 4.]

[Example from James Skrentny]

slide 36

Alpha-beta pruning example

[Figure: the game tree from slide 15. W returns -3, so β at O is updated from +∞ to -3.]

[Example from James Skrentny]

slide 37

Alpha-beta pruning example

[Figure: the game tree from slide 15. At O, β = -3 is below α = 4 at F, so the remaining child X is pruned.]

[Example from James Skrentny]

slide 38

Alpha-beta pruning example

[Figure: the game tree from slide 15. F returns 4, so β at B is updated from +∞ to 4.]

[Example from James Skrentny]

slide 39

Alpha-beta pruning example

[Figure: the game tree from slide 15. Visit leaf G = -5 under B. B has β = 4.]

[Example from James Skrentny]

slide 40

Alpha-beta pruning example

[Figure: the game tree from slide 15. G returns -5, so β at B is updated from 4 to -5.]

[Example from James Skrentny]

slide 41

Alpha-beta pruning example

[Figure: the game tree from slide 15. B returns -5, so α at A is updated from -∞ to -5.]

[Example from James Skrentny]

slide 42

Alpha-beta pruning example

[Figure: the game tree from slide 15. Descend to C (min) with β = +∞. A has α = -5.]

[Example from James Skrentny]

slide 43

Alpha-beta pruning example

[Figure: the game tree from slide 15. Visit leaf H = 3 under C. C has β = +∞; A has α = -5.]

[Example from James Skrentny]

slide 44

Alpha-beta pruning example

[Figure: the game tree from slide 15. H returns 3, so β at C is updated from +∞ to 3.]

[Example from James Skrentny]

slide 45

Alpha-beta pruning example

[Figure: the game tree from slide 15. Visit leaf I = 8 under C. β at C stays 3.]

[Example from James Skrentny]

slide 46

Alpha-beta pruning example

[Figure: the game tree from slide 15. Descend to J (max) with α = -5. C has β = 3; A has α = -5.]

[Example from James Skrentny]

slide 47

Alpha-beta pruning example

[Figure: the game tree from slide 15. Visit leaf P = 9 under J. J has α = -5; C has β = 3.]

[Example from James Skrentny]

slide 48

Alpha-beta pruning example

[Figure: the game tree from slide 15. P returns 9, so α at J is updated to 9. Since 9 exceeds β = 3 at C, the remaining children Q and R are pruned.]

[Example from James Skrentny]

slide 49

Alpha-beta pruning example

[Figure: the game tree from slide 15. C returns 3, so α at A is updated from -5 to 3. Q and R remain pruned.]

[Example from James Skrentny]

slide 50

Alpha-beta pruning example

[Figure: the game tree from slide 15. State after the C subtree: A has α = 3, C has β = 3, J has α = 9. The next child of A is leaf D = 0, which does not raise α.]

[Example from James Skrentny]

slide 51

Alpha-beta pruning example

[Figure: the game tree from slide 15. Search of E (min): K (max) visits leaves S = 3 and T = 5, reaching α = 5; leaf L = 2 then lowers β at E to 2. A has α = 3.]

[Example from James Skrentny]

slide 52

Alpha-beta pruning example

[Figure: the game tree from slide 15. At E, β = 2 is below α = 3 at A, so the remaining child M (with leaves U and V) is pruned. The value at the root A is 3.]

[Example from James Skrentny]

slide 53

How effective is alpha-beta pruning?

• Depends on the order of successors!

• In the best case, the number of nodes to search is O(b^(m/2)), the square root of minimax’s cost.

• This occurs when each player's best move is the leftmost child.

• In DeepBlue (IBM Chess), the average branching factor was about 6 with alpha-beta instead of 35-40 without.

• The worst case is no pruning at all.
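The effect of successor order can be checked by counting evaluated leaves. The sketch below is an illustration under assumptions: it uses a common alpha-beta variant that tracks a separate running value (not the exact pseudocode from slide 20), and the two nested-list trees `good` and `bad` are hypothetical examples holding the same leaves in different orders.

```python
import math


def alphabeta(node, alpha, beta, is_max, counter):
    """Minimax value of a nested-list tree; counter[0] counts leaves evaluated."""
    if not isinstance(node, list):  # leaf
        counter[0] += 1
        return node
    if is_max:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, counter))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # prune remaining children
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, counter))
        beta = min(beta, value)
        if alpha >= beta:
            break  # prune remaining children
    return value


def leaves_evaluated(tree):
    """Return (root value, number of leaves evaluated) for a max root."""
    counter = [0]
    value = alphabeta(tree, -math.inf, math.inf, True, counter)
    return value, counter[0]


good = [[3, 5], [2, 9], [1, 8]]  # each player's best move comes first
bad = [[8, 1], [9, 2], [5, 3]]   # same leaves, best moves come last

print(leaves_evaluated(good))  # (3, 4)
print(leaves_evaluated(bad))   # (3, 6)
```

Both orders give the same root value; only the amount of work differs, with the second order pruning nothing.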

[Figure: three copies of the E subtree below A (α = 3), with children K (leaves S = 3, T = 5), L = 2, and M (leaves U = 4, V = 3) arranged in different orders, illustrating how successor order changes what is pruned. Annotations shown include E β = 2, K α = 5, and M α = 4.]

slide 54

Game-playing for large games

• We’ve seen how to find game theoretic values. But it is too expensive for large games.

• What do real chess-playing programs do?
   - They can’t possibly search the full game tree
   - They must respond in limited time
   - They can’t pre-compute a solution


slide 55

Game-playing for large games

• The most popular solution: heuristic evaluation functions for games
   - ‘Leaves’ are intermediate nodes at a depth cutoff, not terminals
   - Heuristically estimate their values
   - Huge amount of knowledge engineering (R&N 6.4)
   - Example: Tic-Tac-Toe: (number of 3-lengths open for me) - (number of 3-lengths open for you)

• Each move is a new depth-cutoff game-tree search (as opposed to search the complete game-tree once).

• Depth-cutoff can increase using iterative deepening, as long as there is time left.
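The Tic-Tac-Toe heuristic can be sketched as follows. This is one possible reading, assuming a "3-length" is a row, column, or diagonal and that it is "open" for a player when the opponent has no mark on it; the board encoding (a list of nine cells holding "X", "O", or ".") and the function names are hypothetical.

```python
# The eight 3-in-a-row lines on a board indexed 0..8, row by row.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def open_lines(board, player):
    """Number of lines that contain no mark of the opponent."""
    opponent = "O" if player == "X" else "X"
    return sum(1 for line in LINES if all(board[i] != opponent for i in line))


def evaluate(board, me):
    """(lines open for me) - (lines open for you)."""
    you = "O" if me == "X" else "X"
    return open_lines(board, me) - open_lines(board, you)


board = list("....X....")  # X has taken the center
print(evaluate(board, "X"))  # 4
```

With X in the center, all 8 lines are still open for X and only the 4 lines that avoid the center are open for O. A depth-cutoff search would call such a function at the cutoff instead of a terminal test.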


slide 56

More on large games

• Battle the limited search depth
   - Horizon effect: things can suddenly get much worse just outside your search depth (‘horizon’), but you can’t see that
   - Quiescence / secondary search: select the most ‘interesting’ nodes at the search boundary, expand them further beyond the search depth

• Incorporate book moves
   - Pre-compute / record opening moves, end games


slide 57

Two-player zero-sum discrete finite NON-deterministic games of perfect information

• There is an element of chance (coin flip, dice roll, etc.)

• “Chance node” in game tree, besides Max and Min nodes. Neither player makes a choice. Instead a random choice is made according to the outcome probabilities.

[Figure: game tree with Max at the root and Max, Min, and chance nodes below. One chance node has outcome probabilities p = 0.8 and p = 0.2; the other has p = 0.5 and p = 0.5. Values shown in the tree: -20, +4, +3, +10, -5.]

slide 58

Solving non-deterministic games

• Easy to extend minimax to non-deterministic games

• At chance node, instead of using max() or min(), compute the average (weighted by the probabilities).

• What’s the value for the chance node at right?

• What action should Max take at root?

• The play will be optimal. In what sense?
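The extension can be sketched as a single recursive function. The tree encoding below (tuples tagged "max", "min", or "chance") is an assumption made for illustration, and only the right-hand chance node, whose outcomes -20 and +4 each have p = 0.5, is taken from the slide; the sure payoff of 3 in the second example is hypothetical.

```python
def value(node):
    """Value of a tree with max, min, and chance nodes."""
    if isinstance(node, (int, float)):  # terminal value
        return node
    kind, children = node
    if kind == "max":
        return max(value(child) for child in children)
    if kind == "min":
        return min(value(child) for child in children)
    if kind == "chance":
        # Average of the children, weighted by outcome probability.
        return sum(p * value(child) for p, child in children)
    raise ValueError("unknown node kind: " + str(kind))


# Chance node at right on the slide: +4 and -20, each with p = 0.5.
right_chance = ("chance", [(0.5, 4), (0.5, -20)])
print(value(right_chance))  # -8.0

# Hypothetical: Max choosing between a sure 3 and that gamble.
print(value(("max", [3, right_chance])))  # 3
```

Max and Min nodes are unchanged from minimax; only the chance case is new.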

[Figure: the same game tree as on slide 57. The chance node at right has two Min children with values -20 and +4, each reached with p = 0.5; its value is 4*0.5 + (-20)*0.5 = -8.]

slide 59

What you should know

• What is a two-player zero-sum discrete finite deterministic game of perfect information

• What is a game tree

• What is the minimax value of a game

• Minimax search

• Alpha-beta pruning

• Basic understanding of very large games

• How to extend minimax to non-deterministic games