What did AlphaGo do to beat the strongest human Go player?

157
March 2016

Upload: tobias-pfeiffer

Post on 07-Jan-2017


Page 1: What did AlphaGo do to beat the strongest human Go player?

March 2016

Page 2: What did AlphaGo do to beat the strongest human Go player?

Mainstream Media

Page 3: What did AlphaGo do to beat the strongest human Go player?

1997

Page 4: What did AlphaGo do to beat the strongest human Go player?

Ing Cup 1985 – 2000 (up to $1,400,000)

Page 5: What did AlphaGo do to beat the strongest human Go player?

5d win 1998

Page 6: What did AlphaGo do to beat the strongest human Go player?

October 2015

Page 7: What did AlphaGo do to beat the strongest human Go player?

This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.

Silver, D. et al., 2016. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), p.484-489.

January 2016

Page 8: What did AlphaGo do to beat the strongest human Go player?

What did AlphaGo do to beat the strongest human Go player?

Tobias Pfeiffer

Page 9: What did AlphaGo do to beat the strongest human Go player?

Go

Page 10: What did AlphaGo do to beat the strongest human Go player?

Computational Challenge

Page 11: What did AlphaGo do to beat the strongest human Go player?

Monte Carlo Method

Page 12: What did AlphaGo do to beat the strongest human Go player?

Neural Networks

Page 13: What did AlphaGo do to beat the strongest human Go player?

Revolution with Neural Networks

Page 14: What did AlphaGo do to beat the strongest human Go player?

What did we learn?

Page 15: What did AlphaGo do to beat the strongest human Go player?

Go

Page 16: What did AlphaGo do to beat the strongest human Go player?
Page 17: What did AlphaGo do to beat the strongest human Go player?
Page 18: What did AlphaGo do to beat the strongest human Go player?
Page 19: What did AlphaGo do to beat the strongest human Go player?
Page 20: What did AlphaGo do to beat the strongest human Go player?
Page 21: What did AlphaGo do to beat the strongest human Go player?
Page 22: What did AlphaGo do to beat the strongest human Go player?
Page 23: What did AlphaGo do to beat the strongest human Go player?
Page 24: What did AlphaGo do to beat the strongest human Go player?
Page 25: What did AlphaGo do to beat the strongest human Go player?
Page 26: What did AlphaGo do to beat the strongest human Go player?
Page 27: What did AlphaGo do to beat the strongest human Go player?
Page 28: What did AlphaGo do to beat the strongest human Go player?
Page 29: What did AlphaGo do to beat the strongest human Go player?
Page 30: What did AlphaGo do to beat the strongest human Go player?
Page 31: What did AlphaGo do to beat the strongest human Go player?
Page 32: What did AlphaGo do to beat the strongest human Go player?
Page 33: What did AlphaGo do to beat the strongest human Go player?
Page 34: What did AlphaGo do to beat the strongest human Go player?
Page 35: What did AlphaGo do to beat the strongest human Go player?

Computational Challenge

Page 36: What did AlphaGo do to beat the strongest human Go player?

Go vs. Chess

Page 37: What did AlphaGo do to beat the strongest human Go player?

Complex vs. Complicated

Page 38: What did AlphaGo do to beat the strongest human Go player?

"While the Baroque rules of chess could only have been created by humans, the rules of go are so elegant, organic, and rigorously logical that if intelligent life forms exist elsewhere in the universe, they almost certainly play go."

Edward Lasker (chess International Master)

Page 39: What did AlphaGo do to beat the strongest human Go player?

Larger board: 19x19 vs. 8x8

Page 40: What did AlphaGo do to beat the strongest human Go player?

Almost every move is legal

Page 41: What did AlphaGo do to beat the strongest human Go player?

Average branching factor: 250 vs. 35
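
To get a feel for what these branching factors mean, a rough back-of-the-envelope calculation helps. The sketch below estimates the size of the game tree as b^d. The typical game lengths (about 150 moves for Go, about 80 for chess) are not on the slide; they are the figures quoted in Silver et al. (2016), so treat them as assumptions.

```python
import math


def tree_size_exponent(branching_factor: float, depth: int) -> float:
    """Return log10(branching_factor ** depth), i.e. the power of ten of the tree size."""
    return depth * math.log10(branching_factor)


# Branching factors are from the slide; the depths are assumed typical game lengths.
go_exponent = tree_size_exponent(250, 150)
chess_exponent = tree_size_exponent(35, 80)

print(f"Go:    about 10^{go_exponent:.0f} move sequences")
print(f"Chess: about 10^{chess_exponent:.0f} move sequences")
```

Both numbers rule out exhaustive search, but the exponent for Go is nearly three times the one for chess.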

Page 42: What did AlphaGo do to beat the strongest human Go player?

State space complexity: 10^171 vs. 10^47

Page 43: What did AlphaGo do to beat the strongest human Go player?

10^80 (estimated number of atoms in the observable universe)

Page 44: What did AlphaGo do to beat the strongest human Go player?

Global impact of moves

Page 45: What did AlphaGo do to beat the strongest human Go player?

[Figure: minimax game tree with alternating MAX and MIN levels; the root (MAX) evaluates to 6]

Traditional Search

Page 46: What did AlphaGo do to beat the strongest human Go player?

[Figure: the same minimax game tree with alternating MAX and MIN levels; leaf scores come from the evaluation function]

Evaluation Function
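
The traditional search on these slides is minimax: MAX picks the child with the highest score, MIN the one with the lowest, and the scores at the leaves come from an evaluation function. A minimal sketch, assuming the tree is given as nested Python lists whose leaves are already-evaluated numbers (the tree below is made up for illustration and is not the one in the figure):

```python
def minimax(node, maximizing):
    """Return the minimax value of a tree given as nested lists.

    A leaf is a plain number: the score an evaluation function assigned
    to that position. Inner nodes are lists of child nodes.
    """
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Hypothetical two-level tree: MAX to move at the root, MIN below it.
tree = [[3, 5], [2, 9], [0, 7]]
print(minimax(tree, maximizing=True))
```

This only works when a good evaluation function exists and the tree is small enough to search, which is exactly what Go makes difficult.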

Page 47: What did AlphaGo do to beat the strongest human Go player?
Page 48: What did AlphaGo do to beat the strongest human Go player?

Monte Carlo Method

Page 49: What did AlphaGo do to beat the strongest human Go player?

What is Pi?

Page 50: What did AlphaGo do to beat the strongest human Go player?

How do you determine Pi?
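
One Monte Carlo answer: throw random points at the unit square and count how many land inside the quarter circle of radius 1. That fraction approximates pi/4. A minimal sketch; the function name and the fixed seed are choices made for this example:

```python
import random


def estimate_pi(samples: int, seed: int = 0) -> float:
    """Estimate pi by sampling points uniformly in the unit square."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # the point lies inside the quarter circle
            inside += 1
    return 4.0 * inside / samples


print(estimate_pi(100_000))
```

More samples give a better estimate, and the loop can be stopped at any time with a usable answer. Both properties carry over to Monte Carlo Tree Search.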

Page 51: What did AlphaGo do to beat the strongest human Go player?
Page 52: What did AlphaGo do to beat the strongest human Go player?

2006

Page 53: What did AlphaGo do to beat the strongest human Go player?

Browne, C. & Powley, E., 2012. A survey of monte carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in Games, 4(1), p.1-49.

Page 54: What did AlphaGo do to beat the strongest human Go player?

[Figure: search tree annotated with wins/visits. Root: 2/4. Children: 1/1, 0/1, 1/1, 0/1. Move labels: A1, D5, F13, C7]

Page 55: What did AlphaGo do to beat the strongest human Go player?

[Figure: search tree annotated with wins/visits. Root: 2/4. Children: 1/1, 0/1, 1/1, 0/1. Move labels: A1, D5, F13, C7]

Selection

Page 56: What did AlphaGo do to beat the strongest human Go player?

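[Figure: search tree annotated with wins/visits. Root: 2/4. Children: 1/1, 0/1, 1/1, 0/1. Move labels: A1, D5, F13, C7. New node B5: 0/0]

Expansion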

Page 57: What did AlphaGo do to beat the strongest human Go player?

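[Figure: search tree annotated with wins/visits. Root: 2/4. Children: 1/1, 0/1, 1/1, 0/1. Move labels: A1, D5, F13, C7. New node B5: 0/0]

Simulation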

Page 58: What did AlphaGo do to beat the strongest human Go player?

Random

Page 59: What did AlphaGo do to beat the strongest human Go player?

Not Human like?

Page 60: What did AlphaGo do to beat the strongest human Go player?

[Figure: search tree after the playout. Root: 3/5. Children: 2/2, 0/1, 1/1, 0/1. Move labels: A1, D5, F13, C7. New node B5: 1/1]

Backpropagation

Page 61: What did AlphaGo do to beat the strongest human Go player?

[Figure: search tree after the playout. Root: 3/5. Children: 2/2, 0/1, 1/1, 0/1. Move labels: A1, D5, F13, C7. New node B5: 1/1]

Perspective

Page 62: What did AlphaGo do to beat the strongest human Go player?

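[Figure: the same tree with wins counted from each player's perspective. Root: 2/5. Children: 1/2, 0/1, 1/1, 0/1. Move labels: A1, D5, F13, C7. New node B5: 1/1]

Perspective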
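
The point of the two "Perspective" slides is that a playout won by one player is a loss for the other, so the result has to flip at every level while it is propagated up the tree. A minimal sketch of that step, assuming a hypothetical `Node` class that only stores a parent link and win/visit counters:

```python
class Node:
    """Tree node for this sketch: parent link plus wins/visits statistics."""

    def __init__(self, parent=None, wins=0, visits=0):
        self.parent = parent
        self.wins = wins
        self.visits = visits


def backpropagate(node, won):
    """Update statistics from the new leaf up to the root.

    `won` is the playout result from the point of view of the player who
    made the move leading to `node`; it flips at each level.
    """
    while node is not None:
        node.visits += 1
        if won:
            node.wins += 1
        won = not won
        node = node.parent


# Numbers taken from the slides: a 1/1 child gets a new 0/0 leaf (B5),
# and the playout from that leaf is a win.
root = Node(wins=2, visits=4)
child = Node(parent=root, wins=1, visits=1)
leaf = Node(parent=child)
backpropagate(leaf, won=True)
print(leaf.wins, leaf.visits)    # leaf statistics after the update
print(child.wins, child.visits)  # the parent sees the same playout as a loss
```

How the root's own win count is kept is a matter of convention, since selection only ever compares children; the sketch makes no claim about matching the slide there.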

Page 63: What did AlphaGo do to beat the strongest human Go player?

Multi Armed Bandit

Page 64: What did AlphaGo do to beat the strongest human Go player?

Multi Armed Bandit

Exploitation vs Exploration

Page 65: What did AlphaGo do to beat the strongest human Go player?

$$\frac{\text{wins}}{\text{visits}} + \text{explorationFactor} \cdot \sqrt{\frac{\ln(\text{totalVisits})}{\text{visits}}}$$
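
This is the UCB1 formula from the multi-armed bandit literature: the first term rewards moves that have won often (exploitation), the second rewards moves that have been tried rarely (exploration). A minimal sketch; the default exploration factor of sqrt(2) and the tuple layout of `children` are assumptions made for this example:

```python
import math


def ucb1(wins, visits, total_visits, exploration_factor=math.sqrt(2)):
    """Win rate plus an exploration bonus that shrinks as a move is visited more."""
    if visits == 0:
        return math.inf  # always try an unvisited move first
    win_rate = wins / visits
    bonus = exploration_factor * math.sqrt(math.log(total_visits) / visits)
    return win_rate + bonus


def select_child(children, total_visits):
    """Pick the (name, wins, visits) tuple with the highest UCB1 score."""
    return max(children, key=lambda c: ucb1(c[1], c[2], total_visits))


# A well-explored good move versus a barely explored one.
children = [("a", 60, 100), ("b", 1, 2)]
print(select_child(children, total_visits=102)[0])
```

With these numbers the rarely visited move wins the comparison even though its win rate is lower, which is the exploration side of the trade-off.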

Page 66: What did AlphaGo do to beat the strongest human Go player?

[Figure: larger search tree annotated with wins/visits at each node, for example 86/193, 58/151 and 2/2]

Page 67: What did AlphaGo do to beat the strongest human Go player?

[Figure: larger search tree annotated with wins/visits at each node, for example 86/193, 58/151 and 2/2]

Page 68: What did AlphaGo do to beat the strongest human Go player?

[Figure: larger search tree annotated with wins/visits at each node, for example 86/193, 58/151 and 2/2]

Page 69: What did AlphaGo do to beat the strongest human Go player?

Generate a valid random move

Page 70: What did AlphaGo do to beat the strongest human Go player?

Who has won?
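
These two abilities, generating a valid random move and deciding who has won, are all the game knowledge a plain playout needs. A minimal sketch using a deliberately tiny stand-in game instead of Go: players alternately take 1 to 3 stones from a pile, and whoever takes the last stone wins. The game and all function names are assumptions made for this example.

```python
import random


def legal_moves(stones):
    """All valid moves: take 1, 2 or 3 stones, but never more than are left."""
    return [n for n in (1, 2, 3) if n <= stones]


def random_playout(stones, player, rng):
    """Play random valid moves until the game ends; return the winner (0 or 1).

    Assumes at least one stone is left when called.
    """
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return player  # this player took the last stone
        player = 1 - player


def win_rate(stones, player, playouts, seed=0):
    """Fraction of random playouts won by `player` when moving first."""
    rng = random.Random(seed)
    wins = sum(random_playout(stones, player, rng) == player for _ in range(playouts))
    return wins / playouts


print(win_rate(stones=10, player=0, playouts=1000))
```

Swapping in another game only means replacing `legal_moves` and the winning condition, which is why the approach suits general game playing.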

Page 71: What did AlphaGo do to beat the strongest human Go player?
Page 72: What did AlphaGo do to beat the strongest human Go player?

General Game Playing

Page 73: What did AlphaGo do to beat the strongest human Go player?

Anytime

Page 74: What did AlphaGo do to beat the strongest human Go player?

Lazy

Page 75: What did AlphaGo do to beat the strongest human Go player?
Page 76: What did AlphaGo do to beat the strongest human Go player?

Expert Knowledge

Page 77: What did AlphaGo do to beat the strongest human Go player?

Neural Networks

Page 78: What did AlphaGo do to beat the strongest human Go player?

2014

Page 79: What did AlphaGo do to beat the strongest human Go player?

What does this even mean?

Page 80: What did AlphaGo do to beat the strongest human Go player?

Neural Networks

Page 81: What did AlphaGo do to beat the strongest human Go player?

Input → “Hidden” Layer → Output

Neural Networks

Page 82: What did AlphaGo do to beat the strongest human Go player?

Weights

Page 83: What did AlphaGo do to beat the strongest human Go player?

Bias/Threshold

Page 84: What did AlphaGo do to beat the strongest human Go player?

4

2

-3

3.2

Activation

Page 85: What did AlphaGo do to beat the strongest human Go player?

5.2 >= 4

2

-3

3.2

Activation

Page 86: What did AlphaGo do to beat the strongest human Go player?

2.2 <= 4

2

-3

3.2

Activation
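
The three activation slides can be read as a single threshold neuron. The sketch below assumes that 2, -3 and 3.2 are the weights, 4 is the threshold, and the inputs are 0 or 1; that mapping is an interpretation of the figure, not something the transcript states.

```python
def neuron_fires(inputs, weights, threshold):
    """Step activation: fire when the weighted input sum reaches the threshold."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return weighted_sum >= threshold


weights = [2, -3, 3.2]  # assumed reading of the slide
threshold = 4

# First and third input active: 2 + 3.2 = 5.2, which reaches the threshold.
print(neuron_fires([1, 0, 1], weights, threshold))
# All inputs active: 2 - 3 + 3.2 = 2.2, which stays below the threshold.
print(neuron_fires([1, 1, 1], weights, threshold))
```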

Page 87: What did AlphaGo do to beat the strongest human Go player?

Activation

Page 88: What did AlphaGo do to beat the strongest human Go player?

Training

Page 89: What did AlphaGo do to beat the strongest human Go player?

Adjust parameters

Page 90: What did AlphaGo do to beat the strongest human Go player?

Supervised Learning

Input, Expected Output

Page 91: What did AlphaGo do to beat the strongest human Go player?

Backpropagation

Page 92: What did AlphaGo do to beat the strongest human Go player?

Data set

Page 93: What did AlphaGo do to beat the strongest human Go player?

Training data + test data

Page 94: What did AlphaGo do to beat the strongest human Go player?

Training

Page 95: What did AlphaGo do to beat the strongest human Go player?

Verify

Page 96: What did AlphaGo do to beat the strongest human Go player?

Overfitting
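
Overfitting only becomes visible when part of the data is held back: a network that does well on the training data but poorly on unseen test data has memorised rather than learned. A minimal sketch of such a split; the 80/20 ratio and the function name are assumptions made for this example.

```python
import random


def split_dataset(examples, test_fraction=0.2, seed=0):
    """Shuffle the examples and return (training_data, test_data)."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    test_size = int(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


training_data, test_data = split_dataset(range(100))
print(len(training_data), len(test_data))
```

The network is trained on `training_data` only; accuracy on `test_data` is what gets reported.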

Page 97: What did AlphaGo do to beat the strongest human Go player?

Deep Neural Networks

Page 98: What did AlphaGo do to beat the strongest human Go player?

Convolutional Neural Networks

Page 99: What did AlphaGo do to beat the strongest human Go player?

Local Receptive Field

Page 100: What did AlphaGo do to beat the strongest human Go player?

Feature Map

Page 101: What did AlphaGo do to beat the strongest human Go player?

Stride

Page 102: What did AlphaGo do to beat the strongest human Go player?

Shared weights and biases

Page 103: What did AlphaGo do to beat the strongest human Go player?

19 x 19 → 3 x 17 x 17

Multiple Feature maps/filters
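
The sizes follow from sliding a small window over the board: a 19 x 19 input shrinks to 17 x 17 if the local receptive field is 3 x 3 and the stride is 1, and each filter produces one such feature map. The 3 x 3 field is inferred from those sizes rather than stated on the slide. A plain-Python sketch of one filter with shared weights and bias:

```python
def conv_output_size(input_size, field_size, stride=1):
    """Number of window positions along one axis (no padding)."""
    return (input_size - field_size) // stride + 1


def convolve(board, kernel, bias=0.0, stride=1):
    """Apply one filter: the same kernel weights and bias at every position."""
    k = len(kernel)
    out = conv_output_size(len(board), k, stride)
    return [
        [
            bias + sum(
                kernel[i][j] * board[r * stride + i][c * stride + j]
                for i in range(k)
                for j in range(k)
            )
            for c in range(out)
        ]
        for r in range(out)
    ]


# Hypothetical input: an empty 19 x 19 board with a single stone in the centre.
board = [[0.0] * 19 for _ in range(19)]
board[9][9] = 1.0
kernel = [[1.0] * 3 for _ in range(3)]

feature_map = convolve(board, kernel)
print(len(feature_map), len(feature_map[0]))
```

Running three different kernels over the same board would give the 3 x 17 x 17 output.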

Page 104: What did AlphaGo do to beat the strongest human Go player?

Architecture

...

Input Features

12 layers with 64 – 192 filters

Output

Page 105: What did AlphaGo do to beat the strongest human Go player?

Architecture

...

Input Features

12 layers with 64 – 192 filters

Output

Page 106: What did AlphaGo do to beat the strongest human Go player?

Architecture

...

Input Features

12 layers with 64 – 192 filters

Output

Page 107: What did AlphaGo do to beat the strongest human Go player?

2.3 million parameters

630 million connections

Page 108: What did AlphaGo do to beat the strongest human Go player?

● Stone Colour x 3
● Liberties x 4
● Liberties after move played x 6
● Legal Move x 1
● Turns since x 5
● Capture Size x 7
● Ladder Move x 1
● KGS Rank x 9

Input Features

Page 109: What did AlphaGo do to beat the strongest human Go player?

Training on game data predicting the next move

Page 110: What did AlphaGo do to beat the strongest human Go player?

55% Accuracy

Page 111: What did AlphaGo do to beat the strongest human Go player?

Mostly beats GnuGo

Page 112: What did AlphaGo do to beat the strongest human Go player?

Combined with MCTS in the Selection

Page 113: What did AlphaGo do to beat the strongest human Go player?

Asynchronous GPU Power

Page 114: What did AlphaGo do to beat the strongest human Go player?

Revolution

Page 115: What did AlphaGo do to beat the strongest human Go player?

Silver, D. et al., 2016. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), p.484-489.

Networks in Training

Page 116: What did AlphaGo do to beat the strongest human Go player?

Silver, D. et al., 2016. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), p.484-489.

Networks in Training

Page 117: What did AlphaGo do to beat the strongest human Go player?

Silver, D. et al., 2016. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), p.484-489.

AlphaGo Search

Page 118: What did AlphaGo do to beat the strongest human Go player?

Selection

Page 119: What did AlphaGo do to beat the strongest human Go player?

Action Value, Prior Probability

Visit Count

Selection

Page 120: What did AlphaGo do to beat the strongest human Go player?

Action Value, Prior Probability

Visit Count

Page 121: What did AlphaGo do to beat the strongest human Go player?

Action Value, Prior Probability

Visit Count

Selection

Page 122: What did AlphaGo do to beat the strongest human Go player?

Action Value, Prior Probability

Visit Count

Selection

Page 123: What did AlphaGo do to beat the strongest human Go player?

0.8

1.2 0.5 1.1 0.9

Action Value + Bonus
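
In Silver et al. (2016) the bonus is proportional to the prior probability from the policy network and decays as the move's visit count grows, so moves the network likes are explored first and the measured action value takes over later. A minimal sketch of that selection score; the function name and the value of the constant `c_puct` are illustrative assumptions.

```python
import math


def selection_score(action_value, prior, visits, parent_visits, c_puct=5.0):
    """Action value plus a bonus: c_puct * prior * sqrt(parent_visits) / (1 + visits)."""
    bonus = c_puct * prior * math.sqrt(parent_visits) / (1 + visits)
    return action_value + bonus


# Same action value and prior, different visit counts.
rarely_visited = selection_score(0.5, prior=0.2, visits=1, parent_visits=100)
often_visited = selection_score(0.5, prior=0.2, visits=10, parent_visits=100)
print(rarely_visited > often_visited)
```

At each step of the descent the child with the highest score is followed.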

Page 124: What did AlphaGo do to beat the strongest human Go player?

0.8

1.2 0.5 1.1 0.9

Expansion

Page 125: What did AlphaGo do to beat the strongest human Go player?

0.8

1.2 0.5 1.1 0.9

Expansion

Page 126: What did AlphaGo do to beat the strongest human Go player?

0.8

1.2 0.5 1.1 0.9

Prior Probability

Page 127: What did AlphaGo do to beat the strongest human Go player?

0.8

1.2 0.5 1.1 0.9

Evaluation

Page 128: What did AlphaGo do to beat the strongest human Go player?

0.8

1.2 0.5 1.1 0.9

Evaluation

Page 129: What did AlphaGo do to beat the strongest human Go player?

0.8

1.2 0.5 1.1 0.9

Rollout

Page 130: What did AlphaGo do to beat the strongest human Go player?

0.8

1.2 0.5 1.1 0.9

Value Network
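
A leaf is evaluated in two ways, by a fast rollout and by the value network, and Silver et al. (2016) combine the two with a mixing parameter before backing the result up the tree. A minimal sketch; the function name is made up for this example, and the default of 0.5 is the equal weighting reported in the paper.

```python
def leaf_value(value_net_estimate, rollout_outcome, mixing=0.5):
    """Weighted mix of the value network estimate and the rollout result.

    mixing = 0 uses only the value network, mixing = 1 only the rollout.
    """
    return (1 - mixing) * value_net_estimate + mixing * rollout_outcome


# Hypothetical leaf: the value network is lukewarm, the rollout was won.
print(leaf_value(value_net_estimate=0.2, rollout_outcome=1.0))
```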

Page 131: What did AlphaGo do to beat the strongest human Go player?

0.81

1.3 0.5 1.1 0.9

Backup

1.6

Page 132: What did AlphaGo do to beat the strongest human Go player?

1202 CPUs, 176 GPUs

0.8

1.2 0.5 1.1 0.9

Page 133: What did AlphaGo do to beat the strongest human Go player?

Tensor

Page 134: What did AlphaGo do to beat the strongest human Go player?

Human Instinct: Policy Network
Reading Capability: Search
Positional Judgement: Value Network

3 Strengths of AlphaGo

Page 135: What did AlphaGo do to beat the strongest human Go player?

Human Instinct: Policy Network
Reading Capability: Search
Positional Judgement: Value Network

Most Important Strength

Page 136: What did AlphaGo do to beat the strongest human Go player?

More Natural

Page 137: What did AlphaGo do to beat the strongest human Go player?

Lee Sedol match

Page 138: What did AlphaGo do to beat the strongest human Go player?

Style

Page 139: What did AlphaGo do to beat the strongest human Go player?

So when AlphaGo plays a slack looking move, we may regard it as a mistake, but perhaps it should more accurately be viewed as a declaration of victory?

An Younggil 8p

Page 140: What did AlphaGo do to beat the strongest human Go player?

Game 2

Page 141: What did AlphaGo do to beat the strongest human Go player?

Game 4

Page 142: What did AlphaGo do to beat the strongest human Go player?
Page 143: What did AlphaGo do to beat the strongest human Go player?

Game 4

Page 144: What did AlphaGo do to beat the strongest human Go player?

Game 4

Page 145: What did AlphaGo do to beat the strongest human Go player?

What can we learn?

Page 146: What did AlphaGo do to beat the strongest human Go player?

Making X faster vs

Doing less of X

Page 147: What did AlphaGo do to beat the strongest human Go player?

Benchmark everything

Page 148: What did AlphaGo do to beat the strongest human Go player?

Solving problems the human way vs

Solving problems the computer way

Page 149: What did AlphaGo do to beat the strongest human Go player?

Don't blindly dismiss approaches as infeasible

Page 150: What did AlphaGo do to beat the strongest human Go player?

One Approach vs

Combination of Approaches

Page 151: What did AlphaGo do to beat the strongest human Go player?

Joy of Creation

Page 152: What did AlphaGo do to beat the strongest human Go player?

PragTob/Rubykon

Page 154: What did AlphaGo do to beat the strongest human Go player?

What did AlphaGo do to beat the strongest human Go player?

Tobias Pfeiffer

Page 155: What did AlphaGo do to beat the strongest human Go player?

Sources

● Maddison, C.J. et al., 2014. Move Evaluation in Go Using Deep Convolutional Neural Networks.

● Silver, D. et al., 2016. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), p.484-489.

● Michael A. Nielsen, "Neural Networks and Deep Learning", Determination Press, 2015 http://neuralnetworksanddeeplearning.com

● Gelly, S. & Silver, D., 2011. Monte-Carlo tree search and rapid action value estimation in computer Go. Artificial Intelligence, 175(11), p.1856-1876.

● I. Althöfer, “On the Laziness of Monte-Carlo Game Tree Search In Non-tight Situations,” Friedrich-Schiller Univ., Jena, Tech. Rep., 2008.

● Browne, C. & Powley, E., 2012. A survey of monte carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in Games, 4(1), p.1-49.

● Gelly, S. & Silver, D., 2007. Combining online and offline knowledge in UCT. Proceedings of the 24th International Conference on Machine Learning (ICML), p.273-280.

● https://www.youtube.com/watch?v=LX8Knl0g0LE&index=9&list=WL

Page 156: What did AlphaGo do to beat the strongest human Go player?

Photo Credit

● http://www.computer-go.info/events/ing/2000/images/bigcup.jpg

● https://en.wikipedia.org/wiki/File:Kasparov-29.jpg

● http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-titan-black/product-images

● http://giphy.com/gifs/dark-thread-after-lCP95tGSbMmWI

● https://cloudplatform.googleblog.com/2016/05/Google-supercharges-machine-learning-tasks-with-custom-chip.html

● https://gogameguru.com/i/2016/01/Fan-Hui-vs-AlphaGo-550x364.jpg

● http://makeitstranger.com/

● CC BY 2.0

– https://en.wikipedia.org/wiki/File:Deep_Blue.jpg

– https://www.flickr.com/photos/luisbg/2094497611/

● CC BY-SA 3.0

– https://en.wikipedia.org/wiki/Alpha%E2%80%93beta_pruning#/media/File:AB_pruning.svg

● CC BY-SA 2.0

– https://flic.kr/p/cPUtny

– https://flic.kr/p/dLSKTQ

– https://www.flickr.com/photos/83633410@N07/7658272558/