Gaussian Processes
Nando de Freitas
University of British Columbia
June 2010
GP resources
• Wikipedia is an excellent resource for the matrix inversion lemma, aka the Sherman–Morrison–Woodbury formula or simply the Woodbury matrix identity.
• “Gaussian processes in machine learning” by Carl Edward Rasmussen is a nice brief introductory tutorial with Matlab code.
• Ed Snelson’s PhD thesis is an excellent resource on Gaussian processes and I will use some of its introductory material.
• The book Gaussian Processes for Machine Learning by Carl Rasmussen and Chris Williams is the ultimate resource.
• Other good resources: Alex Smola, Zoubin Ghahramani, …
Learning and Bayesian inference
By Bayes' rule, the posterior over hypotheses $h \in H$ given data $d$ is

$$p(h \mid d) = \frac{p(d \mid h)\, p(h)}{\int_{h \in H} p(d \mid h)\, p(h)\, dh}$$

where $p(d \mid h)$ is the likelihood, $p(h)$ is the prior (e.g. of the "sheep" class), and $p(h \mid d)$ is the posterior probability that the observed data $d$ (e.g. an image) depicts a "sheep".
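As a toy numerical illustration (my addition; the prior and likelihood values below are invented for a hypothetical "sheep" vs. "not sheep" classifier), Bayes' rule over a discrete hypothesis space is a few lines of Matlab:

% Discrete Bayes' rule with two hypotheses: "sheep" and "not sheep".
prior      = [0.3; 0.7];    % p(h): hypothetical prior class probabilities
likelihood = [0.8; 0.1];    % p(d|h): hypothetical likelihood of the observed image
posterior  = likelihood.*prior / sum(likelihood.*prior);  % p(h|d) by Bayes' rule
disp(posterior')            % approx [0.774, 0.226]: "sheep" is now most probable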
Nonlinear regression
Sampling from P(f)

N = 5;       % The number of training points.
sigma = 0.1; % Noise variance.
h = 1;       % Kernel parameter.

% Randomly generate training points on [-5,5].
X = -5 + 10*rand(N,1);
x = (-5:0.1:5)'; % The test points.
n = size(x,1);   % The number of test points.

% Construct the mean and covariance functions.
m = inline('0.25*x.^2', 'x'); % another example: m = inline('sin(0.9*x)','x');
K = inline(['exp((-1/(h^2))*(repmat(transpose(p),size(q)) - repmat(q,size(transpose(p)))).^2)'], 'p', 'q', 'h');

% Demonstrate how to sample functions from the prior:
L = 5;  % Number of functions sampled from the prior P(f).
f = zeros(n,L);
for i=1:L
  f(:,i) = m(x) + sqrtm(K(x,x,h) + sigma*eye(n))'*randn(n,1);
end
plot(x,f,'linewidth',2);
[Snelson, 2007]
Active learning with GPs
Expected improvement
[Figure: Bayesian optimization of an unknown cost function over a parameter. Cost function panel: the actual unknown function, the GP function approximation, the data points, and the next evaluation. Second panel: expected improvement vs. parameter.]
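The slides show expected improvement only pictorially. For minimization, the standard closed form is EI(x) = (f_best - mu(x)) Phi(z) + s(x) phi(z) with z = (f_best - mu(x))/s(x). A minimal Matlab sketch, assuming the GP posterior mean mu, posterior standard deviation s (vectors over the test points x), and the incumbent best value fbest are already computed (e.g. with the regression code later in these slides):

% Expected improvement for minimization (mu, s, fbest assumed given).
Phi = @(z) 0.5*(1 + erf(z/sqrt(2)));    % standard normal CDF, toolbox-free
phi = @(z) exp(-z.^2/2)/sqrt(2*pi);     % standard normal PDF
z  = (fbest - mu) ./ s;
EI = (fbest - mu).*Phi(z) + s.*phi(z);  % expected improvement at each test point
EI(s == 0) = 0;                         % no expected improvement where the GP is certain
[bestEI, idx] = max(EI);                % next evaluation point: x(idx)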
2D sensor placement application by Andreas Krause and Carlos Guestrin
Sensor network scheduling
How can we automatically schedule sensors to obtain the best understanding of the environment while minimizing resource expenditure (power, bandwidth, need for human intervention)?
GP regression
Gaussian noise likelihood: $p(\mathbf{y} \mid \mathbf{f}) = \mathcal{N}(\mathbf{f}, \sigma^2 I)$

Zero-mean GP prior: $p(\mathbf{f}) = \mathcal{N}(\mathbf{0}, K)$

The marginal likelihood (evidence) is Gaussian:

$$p(\mathbf{y}) = \int p(\mathbf{y} \mid \mathbf{f})\, p(\mathbf{f})\, d\mathbf{f} = \mathcal{N}(\mathbf{0},\, K + \sigma^2 I)$$

Proof: writing $\mathbf{y} = \mathbf{f} + \boldsymbol{\epsilon}$ with independent $\mathbf{f} \sim \mathcal{N}(\mathbf{0}, K)$ and $\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \sigma^2 I)$, the sum of independent Gaussians is Gaussian, with mean $\mathbf{0} + \mathbf{0} = \mathbf{0}$ and covariance $K + \sigma^2 I$.
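Not in the slides, but a quick Monte Carlo sanity check of this identity is easy in the Matlab style used above (the kernel line is copied from the earlier sampling snippet):

% Monte Carlo check: the sample covariance of y = f + noise
% should approach K(X,X) + sigma*eye(N) as the sample size grows.
N = 5; sigma = 0.1; h = 1;
X = -5 + 10*rand(N,1);
K = inline(['exp((-1/(h^2))*(repmat(transpose(p),size(q)) - repmat(q,size(transpose(p)))).^2)'], 'p', 'q', 'h');
S = 100000;                                  % number of Monte Carlo samples
C = chol(K(X,X,h) + 1e-10*eye(N))';          % small jitter for numerical stability
Y = C*randn(N,S) + sqrt(sigma)*randn(N,S);   % y = f + eps, f ~ N(0,K), eps ~ N(0, sigma*I)
disp(norm(Y*Y'/S - (K(X,X,h) + sigma*eye(N))))  % should be small, shrinking with S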
GP regression
Train set: function values $\mathbf{f}$ at the training inputs $X$.
Test set: function values $\mathbf{f}_*$ at the test inputs $X_*$.

Both sets are, by definition of a GP, jointly Gaussian:

$$\begin{bmatrix} \mathbf{f} \\ \mathbf{f}_* \end{bmatrix} \sim \mathcal{N}\left(\mathbf{0},\, \begin{bmatrix} K(X,X) & K(X,X_*) \\ K(X_*,X) & K(X_*,X_*) \end{bmatrix}\right)$$

The joint distribution of the measurements $\mathbf{y}$ and the test values $\mathbf{f}_*$ is:

$$\begin{bmatrix} \mathbf{y} \\ \mathbf{f}_* \end{bmatrix} \sim \mathcal{N}\left(\mathbf{0},\, \begin{bmatrix} K(X,X) + \sigma^2 I & K(X,X_*) \\ K(X_*,X) & K(X_*,X_*) \end{bmatrix}\right)$$
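In code, this joint covariance is just a block matrix. A sketch, assuming the variables X, x, N, n, sigma, h and the kernel K from the earlier snippets (note that K(X,x,h), as defined there, returns the n-by-N block cov(f*, y)):

% Assemble the joint covariance of [y; fstar] from the kernel blocks.
Kyy = K(X,X,h) + sigma*eye(N);  % cov(y, y): kernel plus observation noise
Ksy = K(X,x,h);                 % cov(fstar, y), an n-by-N block
Kss = K(x,x,h);                 % cov(fstar, fstar), noise-free
Joint = [Kyy, Ksy'; Ksy, Kss];  % (N+n)-by-(N+n) covariance of [y; fstar]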
GP regression
The predictive conditional distribution is Gaussian too:

$$p(\mathbf{f}_* \mid \mathbf{y}) = \mathcal{N}(\boldsymbol{\mu}_*, \Sigma_*), \quad
\boldsymbol{\mu}_* = K(X_*,X)\,[K(X,X) + \sigma^2 I]^{-1}\,\mathbf{y}, \quad
\Sigma_* = K(X_*,X_*) - K(X_*,X)\,[K(X,X) + \sigma^2 I]^{-1}\,K(X,X_*)$$

Proof sketch: apply the Gaussian conditioning formula to the joint distribution above. If $[\mathbf{a}; \mathbf{b}] \sim \mathcal{N}(\mathbf{0}, [A,\, C;\, C^\top,\, B])$, then $\mathbf{a} \mid \mathbf{b} \sim \mathcal{N}(C B^{-1} \mathbf{b},\; A - C B^{-1} C^\top)$; here $\mathbf{a} = \mathbf{f}_*$, $\mathbf{b} = \mathbf{y}$, and $B = K(X,X) + \sigma^2 I$.
GP regression
% Generate random training labels.
F = m(X) + chol(K(X,X,h) + sigma*eye(N))'*randn(N,1);
M = m(X);

% COMPUTE POSTERIOR MEAN AND VARIANCE
S = eye(N)*sigma + K(X,X,h);
y = m(x) + K(X,x,h)*inv(S)*(F - M);
y = y';
for i = 1:n
  xi = x(i);
  c(i) = sigma + K(xi,xi,h) - K(X,xi,h)*inv(S)*K(xi,X,h);  % predictive variance
end

% Plot the mean and 95% confidence intervals. Note that c holds variances,
% so the interval is the mean plus/minus two standard deviations.
hold on
plot(x, y - 2*sqrt(c), 'g-', 'linewidth', 3)
plot(x, y + 2*sqrt(c), 'g-', 'linewidth', 3)
plot(x, y, 'r', 'linewidth', 3)
plot(X, F, 'bx', 'linewidth', 15)
Parameter learning for GPs: maximum likelihood
For example, we can parameterize the mean and covariance, say $m(x) = a x^2$ and $k(x, x') = \exp(-(x - x')^2 / h^2)$ as in the code above, and choose the parameters that maximize the marginal likelihood $p(\mathbf{y}) = \mathcal{N}(\mathbf{m},\, K + \sigma^2 I)$ of the training data.
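The slides leave the optimization itself implicit. A minimal sketch, assuming the variables X, F, M, N, sigma and the kernel K from the regression code above, is to minimize the negative log marginal likelihood over log h with fminsearch:

% Fit the kernel width h by maximizing the log marginal likelihood
% (equivalently, minimizing its negative) of the training labels F.
nll = @(logh) 0.5*(F-M)'*((K(X,X,exp(logh)) + sigma*eye(N))\(F-M)) ...
    + 0.5*log(det(K(X,X,exp(logh)) + sigma*eye(N))) ...
    + 0.5*N*log(2*pi);
logh = fminsearch(nll, log(1));  % search over log h so that h stays positive
h = exp(logh)

Searching over log h is a simple way to enforce positivity; with more hyperparameters, fminsearch takes a vector argument, and gradient-based optimizers using analytic derivatives of the log marginal likelihood are the standard choice.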