
Artificial Intelligence COMP-424

Lecture notes by Alexandre Tomberg

Prof. Joelle Pineau, McGill University

Winter 2009


Table of Contents (December-03-08, 12:16 PM)

I. History of AI
II. Search
   1. Uninformed Search Methods
   2. Informed Search
   3. Search for Optimization Problems
   4. Game Playing
   5. Constraint Satisfaction
III. Logic
   1. Knowledge Representation: Logic
   2. First Order Logic
   3. Planning
   4. Spatial Planning
IV. Probability
   1. Reasoning under Uncertainty
   2. Bayesian Networks
V. Machine Learning
   1. Machine Learning: Parameter Estimation
   2. Learning with Missing Values
   3. Supervised Learning
   4. Neural Nets
   5. Decision Trees
VI. Decision Theory
   1. Utility Theory
   2. Markov Decision Processes (MDPs)
   3. Reinforcement Learning


History of AI (January-06-09, 10:03 AM)


Uninformed Search Methods (January-08-09, 10:06 AM)


Generic Search Algorithm:

Algorithm 1: BFS


Algorithm 2: DFS

Algorithm 3: Depth limited search

Algorithm 4: Iterative Deepening
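The four algorithm boxes above all instantiate the generic search template: keep a frontier of partial paths and expand it under some queueing discipline. A minimal Python sketch of BFS, depth-limited search, and iterative deepening, assuming an adjacency-list graph (the toy graph `g` is an illustration, not an example from the notes):

```python
from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search; returns a shortest path (in edges) or None."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None

def depth_limited(graph, node, goal, limit, path=None):
    """DFS that gives up below depth `limit`; returns a path or None."""
    path = (path or []) + [node]
    if node == goal:
        return path
    if limit == 0:
        return None
    for nxt in graph.get(node, []):
        if nxt not in path:  # avoid cycles along the current path
            found = depth_limited(graph, nxt, goal, limit - 1, path)
            if found:
                return found
    return None

def iterative_deepening(graph, start, goal, max_depth=20):
    """Run depth-limited search with increasing depth limits."""
    for limit in range(max_depth + 1):
        found = depth_limited(graph, start, goal, limit)
        if found:
            return found
    return None

# Toy graph used only for illustration.
g = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
```

DFS proper is `depth_limited` with an effectively infinite limit; iterative deepening recovers BFS's shallowest-solution guarantee while keeping DFS's memory footprint.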


Informed Search (January-13-09, 10:02 AM)


Algorithm #1: Best-First Search

Algorithm #2: Heuristic Search

Algorithms (January-13-09, 10:34 AM)


Algorithm #3: A* Search
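A* orders the frontier by f(n) = g(n) + h(n), the cost so far plus a heuristic estimate of the cost to go; with an admissible heuristic the first goal popped is optimal. One possible sketch, where the weighted graph and the trivial zero heuristic are illustrative assumptions:

```python
import heapq

def a_star(graph, h, start, goal):
    """A* search with f(n) = g(n) + h(n).
    `graph` maps node -> {neighbor: edge_cost}; returns (cost, path) or None."""
    frontier = [(h(start), 0, start, [start])]  # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        for nxt, step in graph.get(node, {}).items():
            g2 = g + step
            if g2 < best_g.get(nxt, float("inf")):  # found a cheaper route
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h(nxt), g2, nxt, path + [nxt]))
    return None

# Illustrative weighted graph; h = 0 reduces A* to uniform-cost search.
wg = {"A": {"B": 1, "C": 4}, "B": {"C": 1, "D": 5}, "C": {"D": 1}}
cost, path = a_star(wg, lambda n: 0, "A", "D")
```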


Search for Optimization Problems (January-15-09, 10:05 AM)


Algorithm #1: Hill Climbing

Algorithm #2: Simulated Annealing
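Hill climbing always moves to a better neighbour and can get stuck in local optima; simulated annealing escapes them by sometimes accepting worse moves with probability exp(ΔE/T), where the temperature T is gradually lowered. A sketch of the annealing loop, where the 1-D toy objective is an illustrative assumption:

```python
import math
import random

def simulated_annealing(neighbors, value, start, t0=1.0, cooling=0.95,
                        steps=500, seed=0):
    """Maximize `value` by hill climbing that sometimes accepts downhill
    moves with probability exp(delta / T); T is annealed toward 0."""
    rng = random.Random(seed)
    current, temp = start, t0
    best = current
    for _ in range(steps):
        nxt = rng.choice(neighbors(current))
        delta = value(nxt) - value(current)
        # Uphill moves are always taken; downhill moves with prob exp(delta/T).
        if delta > 0 or rng.random() < math.exp(delta / max(temp, 1e-9)):
            current = nxt
        if value(current) > value(best):
            best = current
        temp *= cooling
    return best

# Illustrative 1-D problem: maximize -(x - 3)^2 over the integers.
best = simulated_annealing(lambda x: [x - 1, x + 1],
                           lambda x: -(x - 3) ** 2, start=0)
```

Setting t0 = 0 turns the same loop into plain hill climbing, which is the usual way to compare the two.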

Iterative Improvement Algorithms (January-15-09, 10:05 AM)


Genetic Algorithms (January-15-09, 11:06 AM)


Game Playing (January-20-09, 10:03 AM)


Minimax Search (January-20-09, 10:07 AM)


α-β Pruning (January-20-09, 10:44 AM)
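The minimax search and α-β pruning covered in these two sections could be sketched as follows; the two-ply game tree and leaf values are made up purely for illustration:

```python
def alphabeta(state, depth, alpha, beta, maximizing, children, evaluate):
    """Minimax with alpha-beta pruning. `children(state)` lists successor
    states; `evaluate(state)` scores leaves from MAX's point of view."""
    succ = children(state)
    if depth == 0 or not succ:
        return evaluate(state)
    if maximizing:
        value = float("-inf")
        for child in succ:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:   # MIN will never let play reach here: prune
                break
        return value
    else:
        value = float("inf")
        for child in succ:
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, children, evaluate))
            beta = min(beta, value)
            if alpha >= beta:   # MAX already has a better option: prune
                break
        return value

# Illustrative 2-ply tree: MAX picks a move, MIN replies.
tree = {"root": ["L", "R"], "L": ["L1", "L2"], "R": ["R1", "R2"]}
leaf_values = {"L1": 3, "L2": 5, "R1": 2, "R2": 9}
best = alphabeta("root", 2, float("-inf"), float("inf"), True,
                 lambda s: tree.get(s, []), lambda s: leaf_values.get(s, 0))
```

On this tree the right subtree is cut off after seeing R1 = 2, since MAX is already guaranteed 3 on the left.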


Constraint Satisfaction (January-22-09, 10:10 AM)


Knowledge Representation: Logic (January-27-09, 10:10 AM)


First Order Logic (February-18-09, 7:50 PM)


Planning (February-03-09, 10:11 AM)


Partial Order Planning Algorithm (February-18-09, 8:55 PM)


Least Commitment

Analysis


Spatial Planning (February-03-09, 10:32 AM)


If we know probabilities, what actions should we choose?

Reasoning under Uncertainty (February-18-09, 9:13 PM)


Bayesian Networks (March-19-09, 3:26 PM)


Machine Learning: Parameter Estimation (March-03-09, 10:09 AM)


Statistical Parameter Fitting (March-03-09, 10:34 AM)


Maximum Likelihood Estimate (MLE) (March-03-09, 10:53 AM)


Learning with Missing Values (March-10-09, 10:14 AM)


Basic EM algorithm:
1. Start with an initial parameter setting.
2. Repeat:
   a. Expectation step: complete the data by assigning values to the missing items.
   b. Maximization step: compute the maximum log-likelihood and new parameters on the completed data.
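As a concrete (illustrative) instance of the E/M loop, here it is for a mixture of two unit-variance Gaussians, where the "missing items" are the soft cluster assignments; the data set and initial means are assumptions made up for this example:

```python
import math

def em_two_gaussians(data, m1, m2, iters=20):
    """Soft EM for a mixture of two unit-variance Gaussians with equal,
    fixed mixing weights; only the two means are estimated."""
    for _ in range(iters):
        # E-step: responsibility of component 1 for each data point.
        r1 = []
        for x in data:
            p1 = math.exp(-0.5 * (x - m1) ** 2)
            p2 = math.exp(-0.5 * (x - m2) ** 2)
            r1.append(p1 / (p1 + p2))
        # M-step: means become responsibility-weighted averages.
        n1 = sum(r1)
        n2 = len(data) - n1
        m1 = sum(r * x for r, x in zip(r1, data)) / n1
        m2 = sum((1 - r) * x for r, x in zip(r1, data)) / n2
    return m1, m2

# Illustrative data: two well-separated clusters around 1 and 5.
data = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
m1, m2 = em_two_gaussians(data, m1=0.0, m2=6.0)
```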


Soft EM for a general Bayes net:


Machine Learning: Clustering (March-19-09, 4:21 PM)


Supervised Learning (March-10-09, 10:55 AM)


Overfitting (April-14-09, 8:35 PM)


Gradient Descent: given w_0, for i = 0, 1, 2, ... do:
   w_{i+1} = w_i − α_i ∇E(w_i)
Repeat as long as necessary.
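A bare-bones sketch of this update loop, using an illustrative quadratic objective E(w) = (w − 2)^2 and a fixed step size:

```python
def gradient_descent(grad, w0, alpha=0.1, iters=100):
    """Plain gradient descent: repeatedly apply w <- w - alpha * grad(w)."""
    w = w0
    for _ in range(iters):
        w = w - alpha * grad(w)
    return w

# Illustrative objective E(w) = (w - 2)^2, so grad E(w) = 2 * (w - 2);
# the minimizer is w = 2.
w = gradient_descent(lambda w: 2 * (w - 2), w0=0.0)
```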

Finding Parameters in General (April-14-09, 9:05 PM)


Batch vs. Online Optimization (April-14-09, 9:38 PM)


What we should know:


Neural Nets (March-19-09, 4:48 PM)


Forward pass:
for layer k = 1 ... K do:
   1. Compute the output of all units in layer k.
   2. Copy this output as the input to the next layer.
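The layer-by-layer loop above, sketched for a fully connected sigmoid network; the 2-2-1 architecture and hand-picked weights are illustrative assumptions:

```python
import math

def forward(x, layers):
    """Forward pass through a fully connected sigmoid network.
    `layers` is a list of (weight_matrix, bias_vector) pairs, one per layer."""
    out = x
    for W, b in layers:
        # Each unit: sigmoid of the weighted sum of the previous layer's output.
        out = [1.0 / (1.0 + math.exp(-(sum(w * o for w, o in zip(row, out)) + bi)))
               for row, bi in zip(W, b)]
    return out  # the output of layer K

# Illustrative 2-2-1 network with hand-picked weights and zero biases.
layers = [([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),   # hidden layer
          ([[1.0, 1.0]], [0.0])]                      # output layer
y = forward([1.0, 0.0], layers)
```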

Feed Forward Neural Networks (April-15-09, 10:48 AM)


Backpropagation algorithm:
1. Forward pass: compute the output of the network, going from the input layer to the output layer.
2. Backward pass: compute the gradient of the error for every weight inside the network, going from the output layer towards the input layer.
3. Update: update the weights using the standard rule w ← w − α ∂E/∂w.
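The three steps might look as follows for a deliberately tiny 1-1-1 sigmoid network with no biases; the network, training pair, and learning rate are illustrative assumptions, and repeated steps should shrink the squared error:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(x, t, w1, w2, alpha=0.5):
    """One backprop step for a 1-1-1 sigmoid network:
    h = sigmoid(w1*x), y = sigmoid(w2*h), error E = (y - t)^2 / 2."""
    # 1. Forward pass.
    h = sigmoid(w1 * x)
    y = sigmoid(w2 * h)
    # 2. Backward pass: chain rule from the output toward the input.
    delta_out = (y - t) * y * (1 - y)         # dE/d(net_out)
    grad_w2 = delta_out * h
    delta_hid = delta_out * w2 * h * (1 - h)  # dE/d(net_hid)
    grad_w1 = delta_hid * x
    # 3. Update with the standard rule w <- w - alpha * dE/dw.
    return w1 - alpha * grad_w1, w2 - alpha * grad_w2

def loss(x, t, w1, w2):
    y = sigmoid(w2 * sigmoid(w1 * x))
    return 0.5 * (y - t) ** 2

# Illustrative single training pair (x = 1, target = 1).
w1, w2 = 0.5, -0.5
before = loss(1.0, 1.0, w1, w2)
for _ in range(100):
    w1, w2 = train_step(1.0, 1.0, w1, w2)
after = loss(1.0, 1.0, w1, w2)
```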


Overfitting in Neural Nets (April-15-09, 12:56 PM)


Decision Trees (April-15-09, 1:04 PM)


Utility Theory (April-15-09, 1:54 PM)


Utility Models:


Maximizing Expected Utility (MEU) Principle (April-15-09, 2:21 PM)


What we should know:


Markov Decision Processes (MDPs) (April-15-09, 2:50 PM)


Policies (April-15-09, 2:50 PM)


Iterative Policy Evaluation Algorithm:
1. Start with some initial guess V_0.
2. During iteration k, update the value function for all states as follows:
   V_{k+1}(s) ← Σ_a π(s,a) [ R(s,a) + γ Σ_{s'} T(s,a,s') V_k(s') ]
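A sketch of this update loop for a deterministic policy on a deterministic two-state MDP; the transition and reward tables are illustrative assumptions:

```python
def policy_evaluation(policy, trans, rew, gamma, iters=60):
    """Iterative policy evaluation for a deterministic policy on a
    deterministic MDP: V_{k+1}(s) = R(s, pi(s)) + gamma * V_k(s')."""
    V = {s: 0.0 for s in trans}                  # initial guess V_0 = 0
    for _ in range(iters):                       # fixed number of sweeps
        V = {s: rew[s][policy[s]] + gamma * V[trans[s][policy[s]]]
             for s in trans}
    return V

# Illustrative two-state MDP: trans[s][a] -> next state, rew[s][a] -> reward.
trans = {0: {"stay": 0, "move": 1}, 1: {"stay": 1, "move": 0}}
rew   = {0: {"stay": 0.0, "move": 1.0}, 1: {"stay": 2.0, "move": 0.0}}
V = policy_evaluation({0: "move", 1: "stay"}, trans, rew, gamma=0.5)
```

Here the fixed point solves V(1) = 2 + 0.5·V(1) = 4 and V(0) = 1 + 0.5·V(1) = 3.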


Searching for a Good Policy (April-15-09, 4:47 PM)


Policy Iteration Algorithm:
1. Start with an initial policy π_0.
2. Compute V^{π_i} using the policy evaluation algorithm.
3. Compute π_{i+1} using the greedy policy update rule on V^{π_i}.
4. Repeat until π_{i+1} = π_i.
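The evaluate/improve loop, sketched on an illustrative deterministic two-state MDP (the tables below are assumptions made up for the example):

```python
def greedy(V, trans, rew, gamma):
    """Greedy policy update rule: pick argmax_a [R(s,a) + gamma * V(s')]."""
    return {s: max(trans[s], key=lambda a: rew[s][a] + gamma * V[trans[s][a]])
            for s in trans}

def policy_iteration(trans, rew, gamma):
    """Alternate policy evaluation and greedy improvement until stable."""
    policy = {s: next(iter(trans[s])) for s in trans}  # arbitrary initial policy
    while True:
        # Evaluate the current deterministic policy by fixed-point iteration.
        V = {s: 0.0 for s in trans}
        for _ in range(60):
            V = {s: rew[s][policy[s]] + gamma * V[trans[s][policy[s]]]
                 for s in trans}
        new_policy = greedy(V, trans, rew, gamma)
        if new_policy == policy:   # policy is stable: done
            return policy, V
        policy = new_policy

# Illustrative two-state MDP: trans[s][a] -> next state, rew[s][a] -> reward.
trans = {0: {"stay": 0, "move": 1}, 1: {"stay": 1, "move": 0}}
rew   = {0: {"stay": 0.0, "move": 1.0}, 1: {"stay": 2.0, "move": 0.0}}
policy, V = policy_iteration(trans, rew, gamma=0.5)
```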


Value Iteration Algorithm:
1. Start with an initial value estimate V_0.
2. Update the value function estimate using:
   V_{k+1}(s) ← max_a [ R(s,a) + γ Σ_{s'} T(s,a,s') V_k(s') ]
3. Repeat until the value function stops changing (‖V_{k+1} − V_k‖ < ε).
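A compact sketch of value iteration on an illustrative deterministic two-state MDP:

```python
def value_iteration(trans, rew, gamma, eps=1e-9):
    """V_{k+1}(s) = max_a [R(s,a) + gamma * V_k(s')], repeated until the
    largest per-state change falls below eps."""
    V = {s: 0.0 for s in trans}
    while True:
        V2 = {s: max(rew[s][a] + gamma * V[trans[s][a]] for a in trans[s])
              for s in trans}
        if max(abs(V2[s] - V[s]) for s in trans) < eps:
            return V2
        V = V2

# Illustrative two-state MDP: trans[s][a] -> next state, rew[s][a] -> reward.
trans = {0: {"stay": 0, "move": 1}, 1: {"stay": 1, "move": 0}}
rew   = {0: {"stay": 0.0, "move": 1.0}, 1: {"stay": 2.0, "move": 0.0}}
V = value_iteration(trans, rew, gamma=0.5)
```

Unlike policy iteration, no explicit policy is maintained; a greedy policy can be read off the converged values at the end.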


Reinforcement Learning (April-15-09, 5:38 PM)


TD (order 0) Learning Algorithm:
1. Initialize the value function.
2. Repeat until feeling sick of it:
   a. Pick a start state s.
   b. Repeat for every time step t:
      i. Choose an action a based on the current policy π and the current state s.
      ii. Take action a; observe reward r and new state s'.
      iii. Compute the TD error: δ = r + γ V(s') − V(s).
      iv. Update the value function: V(s) = V(s) + α_s δ.
      v. Update the current state: s = s'.
      vi. If s' is a terminal state, go to step 2.
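The nested loops above, sketched for value prediction on an illustrative three-state chain with a fixed policy:

```python
def td0(episodes=2000, alpha=0.1, gamma=1.0):
    """TD(0) value prediction on a toy chain 0 -> 1 -> 2 (terminal) under a
    fixed always-move-right policy; reward 1 on reaching the terminal state,
    so with gamma = 1 the true values are V(0) = V(1) = 1."""
    V = {0: 0.0, 1: 0.0, 2: 0.0}               # 1. initialize the value function
    for _ in range(episodes):                   # 2. repeat over episodes
        s = 0                                   # a. pick a start state
        while s != 2:                           # b. repeat for every time step
            s_next = s + 1                      # i-ii. act (move right), observe
            r = 1.0 if s_next == 2 else 0.0     #       reward and new state
            delta = r + gamma * V[s_next] - V[s]   # iii. TD error
            V[s] += alpha * delta                  # iv. value update
            s = s_next                             # v. advance; exits at terminal
    return V

V = td0()
```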


Reinforcement Learning for Control (April-15-09, 6:35 PM)

