CS 59000 Statistical Machine Learning, Lecture 15. Yuan (Alan) Qi, Purdue CS. Oct. 21, 2008.

TRANSCRIPT

Page 1

CS 59000 Statistical Machine Learning
Lecture 15

Yuan (Alan) Qi
Purdue CS

Oct. 21, 2008

Page 2

Outline

• Review of Gaussian Processes (GPs)
• From linear regression to GP
• GP for regression
• Learning hyperparameters
• Automatic Relevance Determination
• GP for classification

Page 3

Gaussian Processes

How do kernels arise naturally in a Bayesian setting?

Instead of assigning a prior to the parameters w, we assign a prior directly to the function values y.

The function space is infinite-dimensional in theory, but finite-dimensional in practice: we only need the function values at the finite set of training and test points.

Page 4

Linear Regression Revisited

Let

We have
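The equations on this slide were images that did not survive extraction. In the standard notation of Bishop's PRML, which this lecture follows, they presumably read:

```latex
y(\mathbf{x}) = \mathbf{w}^{\top}\boldsymbol{\phi}(\mathbf{x}), \qquad
p(\mathbf{w}) = \mathcal{N}(\mathbf{w}\mid \mathbf{0},\, \alpha^{-1}\mathbf{I})
\]
\[
\mathbf{y} = \boldsymbol{\Phi}\mathbf{w}, \qquad \Phi_{nk} = \phi_k(\mathbf{x}_n)
```

so the vector y of function values at the N training inputs is a linear transform of the Gaussian-distributed weights.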

Page 5

From a Prior on Parameters to a Prior on Functions

The prior on function value:
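The formula itself was lost in extraction; since y = Φw is a linear function of the Gaussian weights, the induced prior is presumably the standard PRML result:

```latex
p(\mathbf{y}) = \mathcal{N}(\mathbf{y}\mid \mathbf{0},\, \mathbf{K}), \qquad
\mathbf{K} = \frac{1}{\alpha}\,\boldsymbol{\Phi}\boldsymbol{\Phi}^{\top}, \quad
K_{nm} = \frac{1}{\alpha}\,\boldsymbol{\phi}(\mathbf{x}_n)^{\top}\boldsymbol{\phi}(\mathbf{x}_m)
= k(\mathbf{x}_n, \mathbf{x}_m)
```

This is how the kernel arises naturally: the Gram matrix K of inner products of basis functions becomes the covariance of the prior over function values.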

Page 6

Stochastic Process

A stochastic process is specified by giving the joint distribution over any finite set of function values, in a consistent manner. (Loosely speaking, consistency means that marginalizing a joint distribution gives the same result as the joint distribution defined directly on the smaller subset.)

Page 7

Gaussian Processes

A Gaussian process is a stochastic process in which the joint distribution of any finite set of function values is a multivariate Gaussian distribution.

Without any prior knowledge, we often set the mean to zero. The GP is then fully specified by its covariance function:
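The covariance function shown on the slide was an image; in PRML notation it is presumably:

```latex
\mathbb{E}\!\left[y(\mathbf{x}_n)\,y(\mathbf{x}_m)\right] = k(\mathbf{x}_n, \mathbf{x}_m)
```

A widely used example (PRML eq. 6.63) combines an exponentiated-quadratic term with constant and linear terms:

```latex
k(\mathbf{x}_n, \mathbf{x}_m) = \theta_0 \exp\!\left\{-\frac{\theta_1}{2}\lVert \mathbf{x}_n - \mathbf{x}_m \rVert^2\right\} + \theta_2 + \theta_3\, \mathbf{x}_n^{\top}\mathbf{x}_m
```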

Page 8

Impact of Kernel Function

Covariance matrix: kernel function

Application: economics & finance
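The kernel determines the character of the functions the GP prefers. A minimal sketch of drawing sample functions from a zero-mean GP prior with an exponentiated-quadratic kernel (the function names and parameter values here are illustrative assumptions, not from the slides):

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=1.0, variance=1.0):
    """Exponentiated-quadratic kernel: k(x, x') = v * exp(-(x - x')^2 / (2 l^2))."""
    sq_dist = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale ** 2)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)          # input locations
K = rbf_kernel(x, x)                   # covariance matrix of the function values
# A small jitter keeps the Cholesky factorization numerically stable.
L = np.linalg.cholesky(K + 1e-10 * np.eye(len(x)))
samples = L @ rng.standard_normal((len(x), 3))   # three draws from the GP prior
```

Shrinking `length_scale` makes the sampled functions wigglier; growing it makes them smoother.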

Page 9

Gaussian Process for Regression

Likelihood:

Prior:

Marginal distribution:
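The three formulas on this slide were images; in PRML notation (β is the noise precision) they presumably read:

```latex
p(\mathbf{t}\mid\mathbf{y}) = \mathcal{N}(\mathbf{t}\mid \mathbf{y},\, \beta^{-1}\mathbf{I}_N), \qquad
p(\mathbf{y}) = \mathcal{N}(\mathbf{y}\mid \mathbf{0},\, \mathbf{K})
\]
\[
p(\mathbf{t}) = \int p(\mathbf{t}\mid\mathbf{y})\,p(\mathbf{y})\,d\mathbf{y}
= \mathcal{N}(\mathbf{t}\mid \mathbf{0},\, \mathbf{C}), \qquad
C_{nm} = k(\mathbf{x}_n, \mathbf{x}_m) + \beta^{-1}\delta_{nm}
```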

Page 10

Samples of Data Points

Page 11

Predictive Distribution

The predictive distribution p(t_{N+1} | t_N) is a Gaussian distribution with the following mean and variance:
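The mean and variance were images on the slide; the standard PRML expressions (eqs. 6.66-6.67) are presumably:

```latex
m(\mathbf{x}_{N+1}) = \mathbf{k}^{\top}\mathbf{C}_N^{-1}\mathbf{t}, \qquad
\sigma^2(\mathbf{x}_{N+1}) = c - \mathbf{k}^{\top}\mathbf{C}_N^{-1}\mathbf{k}
```

where k has elements k_n = k(x_n, x_{N+1}) and c = k(x_{N+1}, x_{N+1}) + β⁻¹.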

Page 12

Predictive Mean

The weight a_n is the nth component of C_N^{-1} t, so the predictive mean is a linear combination of kernel functions centered on the training points. We see the same form as in kernel ridge regression and kernel PCA.
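Written out (reconstructing the slide's image formula in PRML notation), the predictive mean is presumably:

```latex
m(\mathbf{x}_{N+1}) = \sum_{n=1}^{N} a_n\, k(\mathbf{x}_n, \mathbf{x}_{N+1}), \qquad
a_n = \left[\mathbf{C}_N^{-1}\mathbf{t}\right]_n
```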

Page 13

GP Regression

Discussion: what is the difference between GP regression and Bayesian regression with Gaussian basis functions?

Page 14

Computational Complexity

GP prediction for a new data point:

GP: O(N^3), where N is the number of data points.
Basis-function model: O(M^3), where M is the dimension of the feature expansion.
When N is large, GP prediction is computationally expensive.
Sparsification: make predictions based on only a few data points (essentially making N small).
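The O(N^3) cost shows up concretely as a linear solve against the N×N matrix C_N. A minimal sketch of exact GP regression prediction; the kernel, noise precision beta, and toy data are illustrative assumptions, not from the slides:

```python
import numpy as np

def rbf(x1, x2, ell=0.3):
    """Exponentiated-quadratic kernel on 1-D inputs."""
    return np.exp(-0.5 * (x1[:, None] - x2[None, :]) ** 2 / ell ** 2)

def gp_predict(x_train, t_train, x_test, beta=25.0, ell=0.3):
    """Exact GP regression; the O(N^3) cost is the solve against C_N."""
    C = rbf(x_train, x_train, ell) + np.eye(len(x_train)) / beta  # C_N = K + (1/beta) I
    k = rbf(x_train, x_test, ell)                                 # k(x_n, x*)
    a = np.linalg.solve(C, t_train)       # a = C_N^{-1} t  (the O(N^3) step)
    mean = k.T @ a                        # predictive mean: sum_n a_n k(x_n, x*)
    c = rbf(x_test, x_test, ell) + np.eye(len(x_test)) / beta
    var = np.diag(c - k.T @ np.linalg.solve(C, k))  # predictive variance
    return mean, var

rng = np.random.default_rng(1)
x_train = rng.uniform(0.0, 1.0, 20)
t_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.2, 20)
mean, var = gp_predict(x_train, t_train, np.array([0.25, 0.75]))
```

Sparse GP methods avoid the full solve by conditioning on a small subset of the data, trading accuracy for speed.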

Page 15

Learning Hyperparameters

Empirical Bayes Methods
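The slide's equations were images. Empirical Bayes here means maximizing the log marginal likelihood with respect to the hyperparameters θ, which in PRML notation is presumably:

```latex
\ln p(\mathbf{t}\mid\boldsymbol{\theta})
= -\tfrac{1}{2}\ln\lvert\mathbf{C}_N\rvert
- \tfrac{1}{2}\,\mathbf{t}^{\top}\mathbf{C}_N^{-1}\mathbf{t}
- \tfrac{N}{2}\ln(2\pi)
\]
\[
\frac{\partial}{\partial\theta_i}\ln p(\mathbf{t}\mid\boldsymbol{\theta})
= -\tfrac{1}{2}\operatorname{Tr}\!\left(\mathbf{C}_N^{-1}\frac{\partial \mathbf{C}_N}{\partial\theta_i}\right)
+ \tfrac{1}{2}\,\mathbf{t}^{\top}\mathbf{C}_N^{-1}\frac{\partial \mathbf{C}_N}{\partial\theta_i}\mathbf{C}_N^{-1}\mathbf{t}
```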

Page 16

Automatic Relevance Determination

Consider two-dimensional problems:

Maximizing the marginal likelihood drives the precision parameter of an irrelevant input dimension toward a small value, reducing that input's relevance to the prediction.
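The kernel on this slide was an image; the ARD kernel for a two-dimensional input in PRML notation (eq. 6.71) is presumably:

```latex
k(\mathbf{x}, \mathbf{x}') = \theta_0 \exp\!\left\{-\tfrac{1}{2}\sum_{i=1}^{2}\eta_i\,(x_i - x'_i)^2\right\}
```

Each input dimension gets its own precision η_i; learning η_i ≈ 0 means dimension i has effectively no influence on the covariance, i.e. it is irrelevant.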

Page 17

Example

t = sin(2π x1)

x2 = x1 + n (a noisy copy of the relevant input x1, where n is noise)

x3 = e (where e is noise, independent of the target)

Page 18

Gaussian Processes for Classification

Likelihood:

GP Prior:

Covariance function:
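The three formulas on this slide were images; in PRML notation, for binary targets t ∈ {0, 1} and latent function values a, they presumably read:

```latex
p(t\mid a) = \sigma(a)^{t}\,\bigl(1-\sigma(a)\bigr)^{1-t}
\]
\[
p(\mathbf{a}_{N+1}) = \mathcal{N}(\mathbf{a}_{N+1}\mid \mathbf{0},\, \mathbf{C}_{N+1}), \qquad
C_{nm} = k(\mathbf{x}_n, \mathbf{x}_m) + \nu\,\delta_{nm}
```

The small constant ν is added to the diagonal for numerical stability; unlike regression, there is no noise term, since the likelihood is the Bernoulli model above.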

Page 19

Sample from GP Prior

Page 20

Predictive Distribution

No analytical solution. We approximate this integration with:

Laplace’s method
Variational Bayes
Expectation propagation
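The integral itself was an image on the slide; in PRML notation it is presumably:

```latex
p(t_{N+1} = 1 \mid \mathbf{t}_N)
= \int \sigma(a_{N+1})\, p(a_{N+1}\mid \mathbf{t}_N)\, da_{N+1}
```

The posterior p(a_{N+1} | t_N) is non-Gaussian because the sigmoid likelihood is not conjugate to the GP prior, hence the need for the approximations listed above.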

Page 21

Laplace’s method for GP Classification (1)

Page 22

Laplace’s method for GP Classification (2)

Taylor expansion:
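The expansion on this slide was an image. Laplace's method expands the log of the unnormalized posterior over the training-set function values a_N; in PRML notation the relevant quantities are presumably:

```latex
\Psi(\mathbf{a}_N) = \ln p(\mathbf{a}_N) + \ln p(\mathbf{t}_N \mid \mathbf{a}_N)
\]
\[
\nabla \Psi(\mathbf{a}_N) = \mathbf{t}_N - \boldsymbol{\sigma}_N - \mathbf{C}_N^{-1}\mathbf{a}_N, \qquad
-\nabla\nabla \Psi(\mathbf{a}_N) = \mathbf{W}_N + \mathbf{C}_N^{-1}
```

where σ_N has elements σ(a_n) and W_N is the diagonal matrix with entries σ(a_n)(1 − σ(a_n)).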

Page 23

Laplace’s method for GP Classification (3)

Newton-Raphson update:
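The update formula was an image; the standard PRML form (eq. 6.83) is presumably:

```latex
\mathbf{a}_N^{\text{new}} = \mathbf{C}_N\left(\mathbf{I} + \mathbf{W}_N\mathbf{C}_N\right)^{-1}
\left\{\mathbf{t}_N - \boldsymbol{\sigma}_N + \mathbf{W}_N\mathbf{a}_N\right\}
```

Iterating this to convergence yields the posterior mode a*, the center of the Laplace approximation.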

Page 24

Laplace’s method for GP Classification (4)

Gaussian approximation:
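The approximation itself was an image; it is presumably the Gaussian centered at the mode a* with covariance given by the inverse Hessian of the negative log posterior:

```latex
q(\mathbf{a}_N) = \mathcal{N}\!\left(\mathbf{a}_N \mid \mathbf{a}^{\star},\, \mathbf{H}^{-1}\right), \qquad
\mathbf{H} = \mathbf{W}_N + \mathbf{C}_N^{-1}
```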

Page 25

Laplace’s method for GP Classification (5)

Question: How to get the mean and the variance above?

Page 26

Predictive Distribution
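The formulas on this slide were images. Under the Laplace approximation, the predictive mean and variance of a_{N+1}, and the resulting class probability (using the standard probit-style approximation to the logistic-Gaussian integral, PRML eq. 4.153), are presumably:

```latex
\mathbb{E}[a_{N+1}\mid\mathbf{t}_N] = \mathbf{k}^{\top}(\mathbf{t}_N - \boldsymbol{\sigma}_N), \qquad
\operatorname{var}[a_{N+1}\mid\mathbf{t}_N] = c - \mathbf{k}^{\top}\left(\mathbf{W}_N^{-1} + \mathbf{C}_N\right)^{-1}\mathbf{k}
\]
\[
p(t_{N+1}=1\mid\mathbf{t}_N) \simeq \sigma\!\left(\kappa(\sigma_a^2)\,\mu_a\right), \qquad
\kappa(\sigma^2) = \left(1 + \pi\sigma^2/8\right)^{-1/2}
```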

Page 27

Example
