
Artificial Intelligence COMP-424

Lecture notes by Alexandre Tomberg

Prof. Joelle Pineau, McGill University

Winter 2009


Table of Contents (December-03-08, 12:16 PM)

I. History of AI
II. Search
   1. Uninformed Search Methods
   2. Informed Search
   3. Search for Optimization Problems
   4. Game Playing
   5. Constraint Satisfaction
III. Logic
   1. Knowledge Representation: Logic
   2. First Order Logic
   3. Planning
   4. Spatial Planning
IV. Probability
   1. Reasoning under Uncertainty
   2. Bayesian Networks
V. Machine Learning
   1. Machine Learning: Parameter Estimation
   2. Learning with Missing Values
   3. Supervised Learning
   4. Neural Nets
   5. Decision Trees
VI. Decision Theory
   1. Utility Theory
   2. Markov Decision Processes (MDPs)
   3. Reinforcement Learning


History of AI (January-06-09, 10:03 AM)


Uninformed Search Methods (January-08-09, 10:06 AM)


Generic Search Algorithm:

Algorithm 1: BFS


Algorithm 2: DFS

Algorithm 3: Depth limited search

Algorithm 4: Iterative Deepening
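The four algorithm boxes above all instantiate the generic search template: keep a frontier of partial paths and expand it under some queueing discipline. A minimal Python sketch of BFS, depth-limited search, and iterative deepening, assuming an adjacency-list graph (the toy graph `g` is an illustration, not an example from the notes):

```python
from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search; returns a shortest path (in edges) or None."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None

def depth_limited(graph, node, goal, limit, path=None):
    """DFS that gives up below depth `limit`; returns a path or None."""
    path = (path or []) + [node]
    if node == goal:
        return path
    if limit == 0:
        return None
    for nxt in graph.get(node, []):
        if nxt not in path:  # avoid cycles along the current path
            found = depth_limited(graph, nxt, goal, limit - 1, path)
            if found:
                return found
    return None

def iterative_deepening(graph, start, goal, max_depth=20):
    """Run depth-limited search with increasing depth limits."""
    for limit in range(max_depth + 1):
        found = depth_limited(graph, start, goal, limit)
        if found:
            return found
    return None

# Toy graph used only for illustration.
g = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
```

DFS proper is `depth_limited` with an effectively infinite limit; iterative deepening recovers BFS's shallowest-solution guarantee while keeping DFS's memory footprint.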


Informed Search (January-13-09, 10:02 AM)


Algorithm #1: Best-First Search

Algorithm #2: Heuristic Search

Algorithms (January-13-09, 10:34 AM)


Algorithm #3: A* Search
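A* orders the frontier by f(n) = g(n) + h(n), the cost so far plus a heuristic estimate of the cost to go; with an admissible heuristic the first goal popped is optimal. One possible sketch, where the weighted graph and the trivial zero heuristic are illustrative assumptions:

```python
import heapq

def a_star(graph, h, start, goal):
    """A* search with f(n) = g(n) + h(n).
    `graph` maps node -> {neighbor: edge_cost}; returns (cost, path) or None."""
    frontier = [(h(start), 0, start, [start])]  # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        for nxt, step in graph.get(node, {}).items():
            g2 = g + step
            if g2 < best_g.get(nxt, float("inf")):  # found a cheaper route
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h(nxt), g2, nxt, path + [nxt]))
    return None

# Illustrative weighted graph; h = 0 reduces A* to uniform-cost search.
wg = {"A": {"B": 1, "C": 4}, "B": {"C": 1, "D": 5}, "C": {"D": 1}}
cost, path = a_star(wg, lambda n: 0, "A", "D")
```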


Search for Optimization Problems (January-15-09, 10:05 AM)


Algorithm #1: Hill Climbing

Algorithm #2: Simulated Annealing
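Hill climbing always moves to a better neighbour and can get stuck in local optima; simulated annealing escapes them by sometimes accepting worse moves with probability exp(ΔE/T), where the temperature T is gradually lowered. A sketch of the annealing loop, where the 1-D toy objective is an illustrative assumption:

```python
import math
import random

def simulated_annealing(neighbors, value, start, t0=1.0, cooling=0.95,
                        steps=500, seed=0):
    """Maximize `value` by hill climbing that sometimes accepts downhill
    moves with probability exp(delta / T); T is annealed toward 0."""
    rng = random.Random(seed)
    current, temp = start, t0
    best = current
    for _ in range(steps):
        nxt = rng.choice(neighbors(current))
        delta = value(nxt) - value(current)
        # Uphill moves are always taken; downhill moves with prob exp(delta/T).
        if delta > 0 or rng.random() < math.exp(delta / max(temp, 1e-9)):
            current = nxt
        if value(current) > value(best):
            best = current
        temp *= cooling
    return best

# Illustrative 1-D problem: maximize -(x - 3)^2 over the integers.
best = simulated_annealing(lambda x: [x - 1, x + 1],
                           lambda x: -(x - 3) ** 2, start=0)
```

Setting t0 = 0 turns the same loop into plain hill climbing, which is the usual way to compare the two.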

Iterative Improvement Algorithms (January-15-09, 10:05 AM)


Genetic Algorithms (January-15-09, 11:06 AM)


Game Playing (January-20-09, 10:03 AM)


Minimax Search (January-20-09, 10:07 AM)


α-β Pruning (January-20-09, 10:44 AM)
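The minimax search and α-β pruning covered in these two sections could be sketched as follows; the two-ply game tree and leaf values are made up purely for illustration:

```python
def alphabeta(state, depth, alpha, beta, maximizing, children, evaluate):
    """Minimax with alpha-beta pruning. `children(state)` lists successor
    states; `evaluate(state)` scores leaves from MAX's point of view."""
    succ = children(state)
    if depth == 0 or not succ:
        return evaluate(state)
    if maximizing:
        value = float("-inf")
        for child in succ:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:   # MIN will never let play reach here: prune
                break
        return value
    else:
        value = float("inf")
        for child in succ:
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, children, evaluate))
            beta = min(beta, value)
            if alpha >= beta:   # MAX already has a better option: prune
                break
        return value

# Illustrative 2-ply tree: MAX picks a move, MIN replies.
tree = {"root": ["L", "R"], "L": ["L1", "L2"], "R": ["R1", "R2"]}
leaf_values = {"L1": 3, "L2": 5, "R1": 2, "R2": 9}
best = alphabeta("root", 2, float("-inf"), float("inf"), True,
                 lambda s: tree.get(s, []), lambda s: leaf_values.get(s, 0))
```

On this tree the right subtree is cut off after seeing R1 = 2, since MAX is already guaranteed 3 on the left.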


Constraint Satisfaction (January-22-09, 10:10 AM)


Knowledge Representation: Logic (January-27-09, 10:10 AM)


First Order Logic (February-18-09, 7:50 PM)


Planning (February-03-09, 10:11 AM)


Partial Order Planning Algorithm (February-18-09, 8:55 PM)


Least Commitment

Analysis


Spatial Planning (February-03-09, 10:32 AM)


If we know probabilities, what actions should we choose?

Reasoning under Uncertainty (February-18-09, 9:13 PM)


Bayesian Networks (March-19-09, 3:26 PM)


Machine Learning: Parameter Estimation (March-03-09, 10:09 AM)


Statistical Parameter Fitting (March-03-09, 10:34 AM)


Maximum Likelihood Estimate (MLE) (March-03-09, 10:53 AM)


Learning with Missing Values (March-10-09, 10:14 AM)


Basic EM algorithm:
1. Start with an initial parameter setting.
2. Repeat:
   a. Expectation step: complete the data by assigning values to the missing items.
   b. Maximization step: compute the maximum log-likelihood and new parameters on the completed data.
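As a concrete (illustrative) instance of the E/M loop, here it is for a mixture of two unit-variance Gaussians, where the "missing items" are the soft cluster assignments; the data set and initial means are assumptions made up for this example:

```python
import math

def em_two_gaussians(data, m1, m2, iters=20):
    """Soft EM for a mixture of two unit-variance Gaussians with equal,
    fixed mixing weights; only the two means are estimated."""
    for _ in range(iters):
        # E-step: responsibility of component 1 for each data point.
        r1 = []
        for x in data:
            p1 = math.exp(-0.5 * (x - m1) ** 2)
            p2 = math.exp(-0.5 * (x - m2) ** 2)
            r1.append(p1 / (p1 + p2))
        # M-step: means become responsibility-weighted averages.
        n1 = sum(r1)
        n2 = len(data) - n1
        m1 = sum(r * x for r, x in zip(r1, data)) / n1
        m2 = sum((1 - r) * x for r, x in zip(r1, data)) / n2
    return m1, m2

# Illustrative data: two well-separated clusters around 1 and 5.
data = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
m1, m2 = em_two_gaussians(data, m1=0.0, m2=6.0)
```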


Soft EM for a general Bayes net:


Machine Learning: Clustering (March-19-09, 4:21 PM)


Supervised Learning (March-10-09, 10:55 AM)


Overfitting (April-14-09, 8:35 PM)


Gradient Descent: given w_0, for i = 0, 1, 2, ... do:
   w_{i+1} = w_i − α_i ∇E(w_i)
Repeat as long as necessary.
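A bare-bones sketch of this update loop, using an illustrative quadratic objective E(w) = (w − 2)^2 and a fixed step size:

```python
def gradient_descent(grad, w0, alpha=0.1, iters=100):
    """Plain gradient descent: repeatedly apply w <- w - alpha * grad(w)."""
    w = w0
    for _ in range(iters):
        w = w - alpha * grad(w)
    return w

# Illustrative objective E(w) = (w - 2)^2, so grad E(w) = 2 * (w - 2);
# the minimizer is w = 2.
w = gradient_descent(lambda w: 2 * (w - 2), w0=0.0)
```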

Finding Parameters in General (April-14-09, 9:05 PM)


Batch vs. Online Optimization (April-14-09, 9:38 PM)


What we should know:


Neural Nets (March-19-09, 4:48 PM)


Forward pass:
for layer k = 1 ... K do:
   1. Compute the output of all units in layer k.
   2. Copy this output as the input to the next layer.
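The layer-by-layer loop above, sketched for a fully connected sigmoid network; the 2-2-1 architecture and hand-picked weights are illustrative assumptions:

```python
import math

def forward(x, layers):
    """Forward pass through a fully connected sigmoid network.
    `layers` is a list of (weight_matrix, bias_vector) pairs, one per layer."""
    out = x
    for W, b in layers:
        # Each unit: sigmoid of the weighted sum of the previous layer's output.
        out = [1.0 / (1.0 + math.exp(-(sum(w * o for w, o in zip(row, out)) + bi)))
               for row, bi in zip(W, b)]
    return out  # the output of layer K

# Illustrative 2-2-1 network with hand-picked weights and zero biases.
layers = [([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),   # hidden layer
          ([[1.0, 1.0]], [0.0])]                      # output layer
y = forward([1.0, 0.0], layers)
```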

Feed Forward Neural Networks (April-15-09, 10:48 AM)


Backpropagation algorithm:
1. Forward pass: compute the output of the network, going from the input layer to the output layer.
2. Backward pass: compute the gradient of the error for every weight inside the network, going from the output layer towards the input layer.
3. Update: update the weights using the standard rule w ← w − α ∂E/∂w.
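The three steps might look as follows for a deliberately tiny 1-1-1 sigmoid network with no biases; the network, training pair, and learning rate are illustrative assumptions, and repeated steps should shrink the squared error:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(x, t, w1, w2, alpha=0.5):
    """One backprop step for a 1-1-1 sigmoid network:
    h = sigmoid(w1*x), y = sigmoid(w2*h), error E = (y - t)^2 / 2."""
    # 1. Forward pass.
    h = sigmoid(w1 * x)
    y = sigmoid(w2 * h)
    # 2. Backward pass: chain rule from the output toward the input.
    delta_out = (y - t) * y * (1 - y)         # dE/d(net_out)
    grad_w2 = delta_out * h
    delta_hid = delta_out * w2 * h * (1 - h)  # dE/d(net_hid)
    grad_w1 = delta_hid * x
    # 3. Update with the standard rule w <- w - alpha * dE/dw.
    return w1 - alpha * grad_w1, w2 - alpha * grad_w2

def loss(x, t, w1, w2):
    y = sigmoid(w2 * sigmoid(w1 * x))
    return 0.5 * (y - t) ** 2

# Illustrative single training pair (x = 1, target = 1).
w1, w2 = 0.5, -0.5
before = loss(1.0, 1.0, w1, w2)
for _ in range(100):
    w1, w2 = train_step(1.0, 1.0, w1, w2)
after = loss(1.0, 1.0, w1, w2)
```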


Overfitting in Neural Nets (April-15-09, 12:56 PM)


Decision Trees (April-15-09, 1:04 PM)


Utility Theory (April-15-09, 1:54 PM)


Utility Models:


Maximizing Expected Utility (MEU) Principle (April-15-09, 2:21 PM)


What we should know:


Markov Decision Processes (MDPs) (April-15-09, 2:50 PM)


Policies (April-15-09, 2:50 PM)


Iterative Policy Evaluation Algorithm:
1. Start with some initial guess V_0.
2. During iteration k, update the value function for all states as follows:
   V_{k+1}(s) ← Σ_a π(s,a) [ R(s,a) + γ Σ_{s'} T(s,a,s') V_k(s') ]
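A sketch of this update loop for a deterministic policy on a deterministic two-state MDP; the transition and reward tables are illustrative assumptions:

```python
def policy_evaluation(policy, trans, rew, gamma, iters=60):
    """Iterative policy evaluation for a deterministic policy on a
    deterministic MDP: V_{k+1}(s) = R(s, pi(s)) + gamma * V_k(s')."""
    V = {s: 0.0 for s in trans}                  # initial guess V_0 = 0
    for _ in range(iters):                       # fixed number of sweeps
        V = {s: rew[s][policy[s]] + gamma * V[trans[s][policy[s]]]
             for s in trans}
    return V

# Illustrative two-state MDP: trans[s][a] -> next state, rew[s][a] -> reward.
trans = {0: {"stay": 0, "move": 1}, 1: {"stay": 1, "move": 0}}
rew   = {0: {"stay": 0.0, "move": 1.0}, 1: {"stay": 2.0, "move": 0.0}}
V = policy_evaluation({0: "move", 1: "stay"}, trans, rew, gamma=0.5)
```

Here the fixed point solves V(1) = 2 + 0.5·V(1) = 4 and V(0) = 1 + 0.5·V(1) = 3.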


Searching for a Good Policy (April-15-09, 4:47 PM)


Policy Iteration Algorithm:
1. Start with an initial policy π_0.
2. Compute V^{π_i} using the policy evaluation algorithm.
3. Compute π_{i+1} using the greedy policy update rule on V^{π_i}.
4. Repeat until π_{i+1} = π_i.
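The evaluate/improve loop, sketched on an illustrative deterministic two-state MDP (the tables below are assumptions made up for the example):

```python
def greedy(V, trans, rew, gamma):
    """Greedy policy update rule: pick argmax_a [R(s,a) + gamma * V(s')]."""
    return {s: max(trans[s], key=lambda a: rew[s][a] + gamma * V[trans[s][a]])
            for s in trans}

def policy_iteration(trans, rew, gamma):
    """Alternate policy evaluation and greedy improvement until stable."""
    policy = {s: next(iter(trans[s])) for s in trans}  # arbitrary initial policy
    while True:
        # Evaluate the current deterministic policy by fixed-point iteration.
        V = {s: 0.0 for s in trans}
        for _ in range(60):
            V = {s: rew[s][policy[s]] + gamma * V[trans[s][policy[s]]]
                 for s in trans}
        new_policy = greedy(V, trans, rew, gamma)
        if new_policy == policy:   # policy is stable: done
            return policy, V
        policy = new_policy

# Illustrative two-state MDP: trans[s][a] -> next state, rew[s][a] -> reward.
trans = {0: {"stay": 0, "move": 1}, 1: {"stay": 1, "move": 0}}
rew   = {0: {"stay": 0.0, "move": 1.0}, 1: {"stay": 2.0, "move": 0.0}}
policy, V = policy_iteration(trans, rew, gamma=0.5)
```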


Value Iteration Algorithm:
1. Start with an initial value estimate V_0.
2. Update the value function estimate using:
   V_{k+1}(s) ← max_a [ R(s,a) + γ Σ_{s'} T(s,a,s') V_k(s') ]
3. Repeat until the value function stops changing (‖V_{k+1} − V_k‖ < ε).
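A compact sketch of value iteration on an illustrative deterministic two-state MDP:

```python
def value_iteration(trans, rew, gamma, eps=1e-9):
    """V_{k+1}(s) = max_a [R(s,a) + gamma * V_k(s')], repeated until the
    largest per-state change falls below eps."""
    V = {s: 0.0 for s in trans}
    while True:
        V2 = {s: max(rew[s][a] + gamma * V[trans[s][a]] for a in trans[s])
              for s in trans}
        if max(abs(V2[s] - V[s]) for s in trans) < eps:
            return V2
        V = V2

# Illustrative two-state MDP: trans[s][a] -> next state, rew[s][a] -> reward.
trans = {0: {"stay": 0, "move": 1}, 1: {"stay": 1, "move": 0}}
rew   = {0: {"stay": 0.0, "move": 1.0}, 1: {"stay": 2.0, "move": 0.0}}
V = value_iteration(trans, rew, gamma=0.5)
```

Unlike policy iteration, no explicit policy is maintained; a greedy policy can be read off the converged values at the end.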


Reinforcement Learning (April-15-09, 5:38 PM)


TD (order 0) Learning Algorithm:
1. Initialize the value function.
2. Repeat until feeling sick of it:
   a. Pick a start state s.
   b. Repeat for every time step t:
      i. Choose an action a based on the current policy π and the current state s.
      ii. Take action a; observe reward r and new state s'.
      iii. Compute the TD error: δ = r + γ V(s') − V(s).
      iv. Update the value function: V(s) = V(s) + α_s δ.
      v. Update the current state: s = s'.
      vi. If s' is a terminal state, go to step 2.
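The nested loops above, sketched for value prediction on an illustrative three-state chain with a fixed policy:

```python
def td0(episodes=2000, alpha=0.1, gamma=1.0):
    """TD(0) value prediction on a toy chain 0 -> 1 -> 2 (terminal) under a
    fixed always-move-right policy; reward 1 on reaching the terminal state,
    so with gamma = 1 the true values are V(0) = V(1) = 1."""
    V = {0: 0.0, 1: 0.0, 2: 0.0}               # 1. initialize the value function
    for _ in range(episodes):                   # 2. repeat over episodes
        s = 0                                   # a. pick a start state
        while s != 2:                           # b. repeat for every time step
            s_next = s + 1                      # i-ii. act (move right), observe
            r = 1.0 if s_next == 2 else 0.0     #       reward and new state
            delta = r + gamma * V[s_next] - V[s]   # iii. TD error
            V[s] += alpha * delta                  # iv. value update
            s = s_next                             # v. advance; exits at terminal
    return V

V = td0()
```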


Reinforcement Learning for Control (April-15-09, 6:35 PM)

