Decision making in episodic environments

• We have just looked at decision making in sequential environments

• Now let’s consider the “easier” problem of episodic environments

– The agent gets a series of unrelated problem instances and has to make some decision or inference about each of them

– This is what most of “machine learning” is about

Example: Image classification

[Figure: an input image paired with its desired output label, e.g. "tomato"]

Example: Spam Filter

Example: Seismic data

[Figure: seismic events plotted by body wave magnitude, labeled as nuclear explosions or earthquakes]

The basic classification framework

y = f(x)

• Learning: given a training set of labeled examples {(x1,y1), …, (xN,yN)}, estimate the parameters of the prediction function f

• Inference: apply f to a never before seen test example x and output the predicted value y = f(x)

(In y = f(x): y is the output, f is the classification function, and x is the input.)
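The two phases can be sketched in a few lines. This is only an illustration under assumed simplifications: the input x is a single number, the label y is 0 or 1, and the only parameter of f is a threshold halfway between the two class means. The names `learn` and `f` are hypothetical, chosen to mirror the notation on the slide.

```python
def learn(training_set):
    """Learning: estimate the parameters of f from labeled examples (x, y).

    Assumed toy model: x is a number, y is 0 or 1, and the single
    parameter is a threshold halfway between the two class means.
    """
    xs0 = [x for x, y in training_set if y == 0]
    xs1 = [x for x, y in training_set if y == 1]
    mean0 = sum(xs0) / len(xs0)
    mean1 = sum(xs1) / len(xs1)
    threshold = (mean0 + mean1) / 2
    # Label assigned to inputs above the threshold
    upper_label = 1 if mean1 > mean0 else 0

    def f(x):
        """Inference: predict the label of a never-before-seen example."""
        return upper_label if x > threshold else 1 - upper_label

    return f


training_set = [(1.0, 0), (2.0, 0), (6.0, 1), (7.0, 1)]
f = learn(training_set)

# Apply f to two test examples that were not in the training set
predictions = [f(1.5), f(8.0)]
```

The classifiers on the following slides differ mainly in what form f takes and in how its parameters are estimated.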

Example: Training and testing

• Key challenge: generalization to unseen examples

[Figure: a training set (labels known) and a test set (labels unknown)]

Naïve Bayes classifier

f(x) = argmax_y P(y | x)

     = argmax_y P(y) P(x | y)

     = argmax_y P(y) ∏_d P(x_d | y)

where x_d is a single dimension or attribute of x
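The decision rule above can be sketched for categorical attributes as follows. This is a minimal illustration, not a definitive implementation: the probabilities are estimated by counting, add-one smoothing is an assumption added here to avoid taking log(0), and the function names and the toy spam data are hypothetical. Sums of logs replace the product for numerical stability.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """examples: list of (x, y), where x is a tuple of categorical values."""
    class_counts = Counter(y for _, y in examples)
    # value_counts[(d, y)][v] = number of class-y examples with x_d == v
    value_counts = defaultdict(Counter)
    values_seen = defaultdict(set)  # distinct values observed per dimension d
    for x, y in examples:
        for d, v in enumerate(x):
            value_counts[(d, y)][v] += 1
            values_seen[d].add(v)
    return class_counts, value_counts, values_seen


def predict_naive_bayes(model, x):
    """Return argmax_y of log P(y) + sum_d log P(x_d | y)."""
    class_counts, value_counts, values_seen = model
    n = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for y, count_y in class_counts.items():
        score = math.log(count_y / n)  # log P(y)
        for d, v in enumerate(x):
            # Add-one smoothing (an assumption of this sketch)
            numerator = value_counts[(d, y)][v] + 1
            denominator = count_y + len(values_seen[d])
            score += math.log(numerator / denominator)  # log P(x_d | y)
        if score > best_score:
            best_label, best_score = y, score
    return best_label


# Hypothetical toy data: x = (mentions an offer?, sender known?)
examples = [
    (("offer", "unknown"), "spam"),
    (("offer", "unknown"), "spam"),
    (("no_offer", "known"), "ham"),
    (("no_offer", "known"), "ham"),
    (("offer", "known"), "ham"),
]
model = train_naive_bayes(examples)
label_a = predict_naive_bayes(model, ("offer", "unknown"))
label_b = predict_naive_bayes(model, ("no_offer", "known"))
```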

Decision tree classifier

Example problem: decide whether to wait for a table at a restaurant, based on the following attributes:

1. Alternate: is there an alternative restaurant nearby?

2. Bar: is there a comfortable bar area to wait in?

3. Fri/Sat: is today Friday or Saturday?

4. Hungry: are we hungry?

5. Patrons: number of people in the restaurant (None, Some, Full)

6. Price: price range ($, $$, $$$)

7. Raining: is it raining outside?

8. Reservation: have we made a reservation?

9. Type: kind of restaurant (French, Italian, Thai, Burger)

10. WaitEstimate: estimated waiting time (0-10, 10-30, 30-60, >60)
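To make the setup concrete, one example can be represented as a mapping from the ten attributes to values, and a decision tree as nested tests on single attributes. The tree below is hand-written and purely hypothetical: it is not learned from data and is not claimed to be the tree shown in the lecture. It only illustrates how a tree maps an attribute vector to a decision.

```python
def will_wait(example):
    """Hypothetical decision tree: each `if` is an internal node that
    tests one attribute, and each `return` is a leaf."""
    if example["Patrons"] == "None":
        return False
    if example["Patrons"] == "Some":
        return True
    # Patrons == "Full": look at the estimated waiting time
    if example["WaitEstimate"] == ">60":
        return False
    if example["WaitEstimate"] == "0-10":
        return True
    # Medium wait: stay only if there is no alternative nearby
    return not example["Alternate"]


# One problem instance described by all ten attributes
example = {
    "Alternate": True,
    "Bar": False,
    "Fri/Sat": False,
    "Hungry": True,
    "Patrons": "Some",
    "Price": "$$$",
    "Raining": False,
    "Reservation": True,
    "Type": "French",
    "WaitEstimate": "0-10",
}
decision = will_wait(example)

# The same evening, but the restaurant is full and the wait is longer
busy_example = dict(example, Patrons="Full", WaitEstimate="30-60")
busy_decision = will_wait(busy_example)
```

Note that a tree need not test every attribute; this one ignores seven of the ten.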

Decision tree classifier

Nearest neighbor classifier

f(x) = label of the training example nearest to x

• All we need is a distance function for our inputs

• No training required!

[Figure: a test example shown among training examples from class 1 and training examples from class 2]
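A minimal sketch of the nearest neighbor rule, assuming inputs are tuples of numbers and Euclidean distance is an acceptable distance function (any other distance could be passed in instead). The function names and the toy points are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def nearest_neighbor(training_set, x, distance=euclidean):
    """f(x) = label of the training example nearest to x.

    There is no training step: the "model" is the training set itself.
    """
    _, nearest_label = min(training_set, key=lambda ex: distance(ex[0], x))
    return nearest_label


training_set = [
    ((0.0, 0.0), 1),
    ((1.0, 0.5), 1),
    ((4.0, 4.0), 2),
    ((5.0, 3.5), 2),
]
label_near_class_2 = nearest_neighbor(training_set, (3.0, 3.0))
label_near_class_1 = nearest_neighbor(training_set, (0.5, 0.5))
```

The cost of skipping training is paid at inference time: each prediction scans the whole training set.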

Linear classifier

• Find a linear function to separate the classes

f(x) = sgn(w1x1 + w2x2 + … + wDxD) = sgn(w · x)
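As a small illustration of the formula, the sketch below evaluates sgn(w · x) for a fixed weight vector. The weights are chosen by hand rather than learned, and mapping sgn(0) to +1 is a convention assumed here.

```python
def sgn(t):
    """Sign function; sgn(0) = +1 is a convention assumed in this sketch."""
    return 1 if t >= 0 else -1


def linear_classify(w, x):
    """f(x) = sgn(w1*x1 + ... + wD*xD) = sgn(w . x)."""
    return sgn(sum(wd * xd for wd, xd in zip(w, x)))


# Hand-picked weights: the separating line is x1 = x2
w = (1.0, -1.0)
labels = [linear_classify(w, (3.0, 1.0)), linear_classify(w, (1.0, 3.0))]
```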

Perceptron

[Figure: a perceptron unit with inputs x, weights w, and bias b]

Output: sgn(w · x + b)

Linear separability

Multi-Layer Neural Network

• Can learn nonlinear functions

• Training: find network weights to minimize the error between true and estimated labels of training examples:

E(f) = Σ_{i=1..N} (y_i − f(x_i))^2

• Minimization can be done by gradient descent provided f is differentiable

– This training method is called back-propagation

Differentiable perceptron

Sigmoid function: σ(t) = 1 / (1 + e^(−t))

Output: σ(w · x + b), where w are the weights
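Because the sigmoid is differentiable, the squared error between true and estimated labels can be minimized by gradient descent. The sketch below does this for a single sigmoid unit, which is the simplest case of the idea; a multi-layer network applies the same chain rule layer by layer. The learning rate, step count, zero initialization, and the toy OR data are all assumptions made for illustration.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def output(w, b, x):
    """Output of the differentiable perceptron: sigmoid(w . x + b)."""
    return sigmoid(sum(wd * xd for wd, xd in zip(w, x)) + b)


def squared_error(w, b, examples):
    """E = sum_i (y_i - f(x_i))^2 over the training examples."""
    return sum((y - output(w, b, x)) ** 2 for x, y in examples)


def train(examples, learning_rate=0.5, steps=2000):
    """Batch gradient descent on the squared error of one sigmoid unit."""
    dim = len(examples[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in examples:
            o = output(w, b, x)
            # Derivative of (y - o)^2 with respect to the pre-activation,
            # using sigmoid'(t) = o * (1 - o)
            delta = -2.0 * (y - o) * o * (1.0 - o)
            for d in range(dim):
                grad_w[d] += delta * x[d]
            grad_b += delta
        # Step against the gradient
        w = [wd - learning_rate * gd for wd, gd in zip(w, grad_w)]
        b -= learning_rate * grad_b
    return w, b


# Toy data: logical OR with labels in {0, 1}
examples = [
    ((0.0, 0.0), 0.0),
    ((0.0, 1.0), 1.0),
    ((1.0, 0.0), 1.0),
    ((1.0, 1.0), 1.0),
]
initial_error = squared_error([0.0, 0.0], 0.0, examples)
w, b = train(examples)
final_error = squared_error(w, b, examples)
thresholded = [output(w, b, x) > 0.5 for x, _ in examples]
```

With sgn in place of the sigmoid, the derivative would be zero almost everywhere, so this update would never move the weights.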

Review: Types of classifiers

• Naïve Bayes

• Decision tree

• Nearest neighbor

• Linear classifier

• Nonlinear classifier
