
CK0146 or TIP8311: Homework 03

Exercise 03.01. You are given a dataset $\mathcal{D}^{(L)}$ (here) consisting of input and output pairs of observations $\{(x_n^{(L)}, t_n^{(L)})\}_{n=1}^{N_{\text{learning}}}$, such that $x_n^{(L)} \in \mathbb{R}^{D}$ with $D = 4$, and $t_n^{(L)} \in \mathbb{R}$. From $\mathcal{D}^{(L)}$, set aside a training set (let $N_{\text{training}} = \tfrac{2}{3} N_{\text{learning}}$) and a validation set ($N_{\text{validation}} = \tfrac{1}{3} N_{\text{learning}}$).
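One possible starting point for the split is sketched below in Python/NumPy; the filename `data_learning.csv` and the column layout (x1, x2, x3, x4, t) are assumptions about the linked file, not part of the assignment.

```python
import numpy as np

# Hypothetical filename and format; substitute the actual learning-set file
# (columns assumed to be x1, x2, x3, x4, t).
data = np.loadtxt("data_learning.csv", delimiter=",")
X, t = data[:, :4], data[:, 4]

rng = np.random.default_rng(0)            # fixed seed, for reproducibility
perm = rng.permutation(len(t))            # shuffle before splitting
n_train = (2 * len(t)) // 3               # N_training = (2/3) N_learning

X_train, t_train = X[perm[:n_train]], t[perm[:n_train]]
X_val, t_val = X[perm[n_train:]], t[perm[n_train:]]   # remaining third
```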

You are requested to build the linear basis function model for regressing the output $t$ onto the space of some $j = 1, \dots, M$ fixed basis functions $\phi_j(x)$ of the inputs $x = (x_1, \dots, x_D)^{\mathsf{T}}$.

Assume that the target variable $t$ is given by a deterministic function $y(x, w)$ with additive Gaussian noise $\varepsilon \sim \mathcal{N}(0, \beta^{-1})$, so that $t = y(x, w) + \varepsilon$ with $t \sim \mathcal{N}(y(x, w), \beta^{-1})$, and let $y(x, w) = w^{\mathsf{T}} \phi(x)$ be the regression model, with parameters $w = (w_0, \dots, w_{M-1})^{\mathsf{T}}$ and $\phi = (\phi_0, \dots, \phi_{M-1})^{\mathsf{T}}$ with $\phi_0(x) = 1$. Also, assume that the output observations $\{t_n\}$ are drawn independently from the distribution $\mathcal{N}(y(x, w), \beta^{-1})$ with $\beta$ unknown.
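Under this Gaussian noise model, maximising the likelihood over $w$ reduces to least squares on the design matrix $\Phi$ (with $\Phi_{nj} = \phi_j(x_n)$), so the fit itself is short. The sketch below uses Gaussian basis functions, one admissible choice under the note at the end; the width parameter `s` and the helper names are assumptions.

```python
def gaussian_design_matrix(X, centres, s=1.0):
    """Phi[n, j] = phi_j(x_n), with column 0 the bias phi_0(x) = 1."""
    # Squared Euclidean distance from every input to every centre.
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    Phi = np.exp(-d2 / (2.0 * s**2))                   # Gaussian bumps
    return np.hstack([np.ones((X.shape[0], 1)), Phi])  # prepend phi_0 = 1

def fit_ml(Phi, t):
    """Maximum-likelihood weights: minimise ||t - Phi w||^2."""
    w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    return w

def rms_error(Phi, t, w):
    """Root-mean-square error of the predictions Phi @ w."""
    return np.sqrt(np.mean((t - Phi @ w) ** 2))
```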

Determine an optimal set of model parameters $w$ under the following two modelling setups:

• A: For a varying number $j = 1, 2, \dots, M_{\max}$ of (fixed) basis functions, you must estimate the model parameters using the training set (to get, say, $\hat{w}_{\text{training}}(j)$) and assess the performance of the resulting model on the validation set (to get, say, $E(w^{*}_{\text{training}}(j))$). Based on the validation results, you must select the optimal number $\hat{M}$ of basis functions (as the one that corresponds to the smallest error) and estimate the parameters of the final model (that is, $\hat{w}_{\text{training+validation}}(\hat{M})$) on the rejoined training and validation sets; a selection sketch covering both setups follows this list.

• B: For a fixed number of (fixed) basis functions (for instance, $\hat{M}$ from above), you must regularise the error function by penalising the $L_2$ norm of the parameter vector. For a varying regularisation parameter $\lambda \in (\lambda_{\min}, \lambda_{\max}) \subseteq \{0\} \cup \mathbb{R}^{+}$, you must estimate the model parameters using the training set (to get, say, $\hat{w}_{\text{training}}(\lambda)$) and assess the performance of the resulting model on the validation set (to get, say, $E(w^{*}_{\text{training}}(\lambda))$). Based on the validation results, you must select the value $\hat{\lambda}$ of the regularisation parameter (as the one that corresponds to the smallest error) and estimate the parameters of the final model (that is, $\hat{w}_{\text{training+validation}}(\hat{\lambda})$) on the rejoined training and validation sets.
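Putting the two setups together, the selection loops might look as follows, reusing the helpers sketched above. The search bounds (`M_max`, the $\lambda$ grid) and the placement of Gaussian centres at randomly chosen training inputs are illustrative assumptions; any reasonable arrangement is allowed by the note below.

```python
# --- Setup A: choose the number of basis functions on the validation set ---
M_max = 20                                            # assumed search bound
results_A = []
for M in range(1, M_max + 1):
    # M - 1 Gaussian centres plus the bias term give M basis functions;
    # centres are placed at randomly chosen training inputs (a heuristic).
    centres = X_train[rng.choice(len(X_train), size=M - 1, replace=False)]
    w = fit_ml(gaussian_design_matrix(X_train, centres), t_train)
    e_val = rms_error(gaussian_design_matrix(X_val, centres), t_val, w)
    results_A.append((e_val, M, centres))
_, M_hat, centres_hat = min(results_A, key=lambda r: r[0])

# --- Setup B: choose the regularisation parameter, with M fixed at M_hat ---
def fit_ridge(Phi, t, lam):
    """Regularised least squares: w = (lam I + Phi^T Phi)^{-1} Phi^T t."""
    A = lam * np.eye(Phi.shape[1]) + Phi.T @ Phi
    return np.linalg.solve(A, Phi.T @ t)

Phi_tr = gaussian_design_matrix(X_train, centres_hat)
Phi_va = gaussian_design_matrix(X_val, centres_hat)
lambdas = np.logspace(-8, 2, 50)          # assumed (lam_min, lam_max) grid
results_B = [(rms_error(Phi_va, t_val, fit_ridge(Phi_tr, t_train, lam)), lam)
             for lam in lambdas]
lam_hat = min(results_B)[1]
```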

Once the models have been defined ($\hat{w}_{\text{training+validation}}(\hat{M})$ and $\hat{w}_{\text{training+validation}}(\hat{\lambda})$ have been determined), you must assess their performance on the independent test set $\mathcal{D}^{(T)}$ (here).
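For concreteness, the final refit and test-set evaluation could then read as follows; the test-set filename is again a hypothetical placeholder for the linked file.

```python
# Rejoin training and validation data and refit both final models.
X_learn = np.vstack([X_train, X_val])
t_learn = np.concatenate([t_train, t_val])
Phi_learn = gaussian_design_matrix(X_learn, centres_hat)

w_A = fit_ml(Phi_learn, t_learn)                  # final model of setup A
w_B = fit_ridge(Phi_learn, t_learn, lam_hat)      # final model of setup B

# Score both on the independent test set D^(T).
test = np.loadtxt("data_test.csv", delimiter=",")  # hypothetical filename
Phi_test = gaussian_design_matrix(test[:, :4], centres_hat)
print("test RMS, setup A:", rms_error(Phi_test, test[:, 4], w_A))
print("test RMS, setup B:", rms_error(Phi_test, test[:, 4], w_B))
```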

[Note] To assess the model performances, you must use the root-mean-square (RMS) error at all stages (training, validation, test). You are free to choose your favourite family of basis functions (for example, Gaussians, sigmoids, etc.), its internal parameters and its arrangement.
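For reference, with $N$ pairs $\{(x_n, t_n)\}$ and fitted weights $w$, the RMS error is

$$
E_{\mathrm{RMS}}(w) \;=\; \sqrt{\frac{1}{N}\sum_{n=1}^{N}\bigl(t_n - y(x_n, w)\bigr)^2}
\;=\; \sqrt{\frac{2\,E(w)}{N}},
\qquad E(w) \;=\; \frac{1}{2}\sum_{n=1}^{N}\bigl(t_n - y(x_n, w)\bigr)^2,
$$

which ties the validation criterion in setups A and B directly to the sum-of-squares error $E(w)$ that the least-squares fit minimises (plus the $L_2$ penalty in setup B).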



Figure 1: Pairwise scatterplots of the 4D input vectors $x = (x_1, \dots, x_4)^{\mathsf{T}}$, coloured by the corresponding value of the outputs $t$.
