1 Monte-Carlo Methods in AI: Overview Prasad Tadepalli


Post on 14-Jan-2016


Monte-Carlo Methods in AI: Overview

Prasad Tadepalli


What is a Monte-Carlo Method?

Any method that relies on repeated random simulations to estimate something.

Simplest case: Polling – who wins the election?

  True probability of a person voting for Obama is $P(\text{Obama})$.
  Ask N = 1000 random registered voters how they vote.
  Calculate $\hat{P}(\text{Obama})$ = #(Obama voters)/1000.
  Apply Chernoff’s bound (written here in its two-sided Hoeffding form, for any error tolerance $\epsilon > 0$):

  $$\Pr\left(\left|\hat{P}(\text{Obama}) - P(\text{Obama})\right| \ge \epsilon\right) \le 2\exp\left(-2\epsilon^{2} N\right)$$

Key idea: Although people are complex and varied, they can be treated as independent samples from an identical distribution for estimation.
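The polling example can be tried out numerically. The following is a minimal sketch, not part of the original slides: the names `simulate_poll` and `hoeffding_bound` and the "true" probability 0.52 are assumptions chosen for illustration, and the bound used is the standard two-sided Hoeffding inequality.

```python
import math
import random


def simulate_poll(true_p, n, rng):
    """Ask n simulated voters; each answers 'Obama' independently with probability true_p."""
    votes = sum(1 for _ in range(n) if rng.random() < true_p)
    return votes / n


def hoeffding_bound(n, eps):
    """Upper bound on Pr(|estimate - true_p| >= eps) for n i.i.d. samples."""
    return min(1.0, 2.0 * math.exp(-2.0 * n * eps * eps))


rng = random.Random(0)   # fixed seed so the run is repeatable
true_p = 0.52            # hypothetical ground truth; unknown in a real poll
estimate = simulate_poll(true_p, 1000, rng)

print(f"estimate of P(Obama): {estimate:.3f}")
print(f"Pr(error >= 0.05) is at most {hoeffding_bound(1000, 0.05):.4f}")
```

With N = 1000 and a tolerance of 0.05, the bound evaluates to 2·exp(−5), roughly 0.0135, regardless of the true probability. That independence from the true value is what makes the bound usable before the poll is taken.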


Applications

First modern use: simulating nuclear reactions in the 1940s, by Stanislaw Ulam

Predicting the behavior of complex systems: weather, finance, fluid dynamics, markets, …

Planning and optimization
  Computer games: Bridge, Go, Solitaire, StarCraft
  Optimal path planning in time-sensitive networks

True model either does not exist or is too complicated to reason about


Two Fundamental Problems

Prediction/Inference Problem

  Given a probabilistic model of how the world operates (a “Bayesian Network”) and some observed evidence, what can we infer about a particular query variable?
  Draw samples from the model in which the observed evidence is true.
  Estimate the fraction of those samples in which the query variable is true.
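The two sampling steps above can be sketched as rejection sampling on a toy network. Everything concrete here is an assumption made for illustration and does not come from the slides: the three-variable network (Rain and Sprinkler both influence WetGrass), its probabilities, and the function names. The sketch samples complete worlds, discards those where the evidence (wet grass) does not hold, and reports the fraction of the remaining samples in which the query (rain) is true.

```python
import random

# Hypothetical network: Rain -> WetGrass <- Sprinkler (all numbers are made up).
P_RAIN = 0.2
P_SPRINKLER = 0.3
P_WET = {(True, True): 0.99, (True, False): 0.90,
         (False, True): 0.80, (False, False): 0.00}


def sample_world(rng):
    """Sample every variable in topological order (parents before children)."""
    rain = rng.random() < P_RAIN
    sprinkler = rng.random() < P_SPRINKLER
    wet = rng.random() < P_WET[(rain, sprinkler)]
    return rain, sprinkler, wet


def estimate_rain_given_wet(n_samples, rng):
    """Estimate P(Rain | WetGrass) by keeping only samples consistent with the evidence."""
    kept = 0
    rain_count = 0
    for _ in range(n_samples):
        rain, _sprinkler, wet = sample_world(rng)
        if not wet:
            continue              # evidence violated: reject this sample
        kept += 1
        rain_count += rain        # True counts as 1
    return rain_count / kept if kept else float("nan")


def exact_rain_given_wet():
    """Exact answer by enumerating the four parent configurations, for comparison."""
    joint = {}
    for rain in (True, False):
        for sprinkler in (True, False):
            prior = (P_RAIN if rain else 1 - P_RAIN) * \
                    (P_SPRINKLER if sprinkler else 1 - P_SPRINKLER)
            joint[(rain, sprinkler)] = prior * P_WET[(rain, sprinkler)]
    return (joint[(True, True)] + joint[(True, False)]) / sum(joint.values())


rng = random.Random(0)
approx = estimate_rain_given_wet(20000, rng)
print(f"sampled estimate: {approx:.3f}, exact: {exact_rain_given_wet():.3f}")
```

Rejection sampling is only one way to realize "draw samples where the evidence is true". It wastes every rejected sample, so it becomes impractical when the evidence is unlikely.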

Planning/Optimization Problem

  Given a faithful simulator of an environment, how can we use it to choose an optimal action?
  Run lots and lots of trials.
  Combine the evidence in a “smart” way.
  Output the action that yields the best results.
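A minimal sketch of this recipe follows. It is an illustration under stated assumptions, not the method the slides go on to teach: the simulator, its three actions, and their mean returns are hypothetical, and the evidence is combined in the simplest possible way (an equal number of trials per action, then a plain average) rather than a "smart" one.

```python
import random


def simulate(action, rng):
    """Hypothetical simulator: one noisy return from taking `action` once."""
    mean_return = {"left": 0.2, "stay": 0.5, "right": 0.8}[action]  # made-up values
    return mean_return + rng.gauss(0.0, 1.0)


def choose_action(actions, trials_per_action, rng):
    """Run the same number of trials for each action and pick the best average."""
    averages = {}
    for action in actions:
        total = sum(simulate(action, rng) for _ in range(trials_per_action))
        averages[action] = total / trials_per_action
    best = max(averages, key=averages.get)
    return best, averages


rng = random.Random(0)
best, averages = choose_action(["left", "stay", "right"], 2000, rng)
print("chosen action:", best)
for action, avg in averages.items():
    print(f"  {action}: average return {avg:.3f}")
```

Uniform allocation spends as many trials on clearly bad actions as on close contenders. Smarter ways of combining evidence shift trials toward the actions that are still plausibly the best.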


Organization

Monday, Tuesday, Wednesday are divided into 2 parts

  Mornings
    Inference/Prediction Problem (Experiments with Genie)
    Application Talk
  Afternoons
    Planning/Optimization Problem (Experiments with MCP)
    Project/Lab (Galcon)

Wednesday evening: dinner @ 5:30, McMenamins, Monroe

Thursday: 2 talks plus tournament project work

Tournament code is due: Friday 9 AM

Friday: Advanced topics, tournaments, student presentations