COMS21202: Symbols, Patterns and Signals (Deterministic Data Models)
TRANSCRIPT
COMS21202: Symbols, Patterns and Signals
Deterministic Data Models
Dima [email protected]
Bristol University, Department of Computer Science, Bristol BS8 1UB, UK
February 5, 2019
Dima [email protected]
COMS21202: Data Acquisition
Data Modelling
- Models are descriptions of the data
- They encode our assumptions about the data
- Enabling us to:
  - design ‘optimal’ algorithms
  - compare and contrast methods
  - quantify performance
- A model is ‘more than’ the data: a ‘generalisation’ of the data
Data Modelling
e.g. build a model of Messi as he rolls the ball across the pitch
Data: collect data of body joints during action from multiple examples
Model: ?
Data Modelling
- No need to play God
- Models do not have to exactly describe the ‘real world’, nor correctly model how data was generated
  - In some cases, we may approximate an underlying physical process as part of our model
  - In others, this may be impossible and/or impractical
- Models only need to enable us to define a method to tackle the task at hand
- Performance of the method then depends on how well the model ‘maps’ the data onto the required solution
  - choice of model is often dictated by practicality of method, as well as by our assumptions about the data
Fish Again

- When classifying, we wish to find the model that achieves maximum discrimination
- The model selected here is a linear classifier
Model Parameters
- Models are defined in terms of parameters (one or more)
- These may be empirically obtained, e.g. by trial and error
- or from training data, by tuning or training the model

one parameter needed: x = t
two parameters needed: y = mx + c
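To make the parameter count concrete, here is a minimal sketch (function names and values are illustrative, not from the slides) of the two models above: a one-parameter threshold x = t and a two-parameter line y = mx + c.

```python
# One-parameter model: a single decision threshold t (x = t).
def threshold_classify(x, t):
    """Assign a sample to one of two classes using threshold t."""
    return "A" if x < t else "B"

# Two-parameter model: a line with slope m and intercept c (y = mx + c).
def line_predict(x, m, c):
    """Predict y from x using the line y = m*x + c."""
    return m * x + c

print(threshold_classify(3.0, t=5.0))   # 3.0 < 5.0, so class "A"
print(line_predict(2.0, m=2.0, c=1.0))  # 2*2 + 1 = 5.0
```

In both cases, ‘training the model’ means choosing values for t, or for m and c, from data.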
Generalisation vs. Overfitting
- Generalisation is probably the most fundamental concept in machine learning
- We do not really care about performance on training data: we already know the answer
- We care about whether we can take a decision on future data
- A good performance on training data is only a means to an end, not a goal in itself
- In fact, trying too hard on training data leads to a damaging phenomenon called overfitting
Generalisation vs. Overfitting
Example
Imagine you are trying to prepare for the Symbols, Patterns and Signals exam this June. You have access to previous exam papers and their worked answers available online. You begin by trying to answer the previous papers and comparing your answers with the model answers provided. Unfortunately, you get carried away and spend all your time memorising the model answers to all past papers. Now, if the upcoming exam consists entirely of past questions, you are certain to do well. But if the new exam asks different questions, you will be ill-prepared. In this case, you are overfitting the past exam papers, and the knowledge you gained did not generalise to future exam questions.
source: Flach (2012), Machine Learning
Generalisation vs. Overfitting
- Simpler models often give good performance and can be more general
- Highly complex models over-fit the training data

two parameters needed: y = mx + c
A large number of parameters needs to be tuned
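The contrast can be sketched numerically (an illustration with made-up data, not from the slides): a two-parameter line fitted by least squares predicts a new point better than a many-parameter polynomial that passes through every training point exactly, i.e. that overfits.

```python
def fit_line(xs, ys):
    """Two-parameter model: least squares fit of y = a + b*x."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    b = (sum(x * y for x, y in zip(xs, ys)) - n * xbar * ybar) / \
        (sum(x * x for x in xs) - n * xbar ** 2)
    return ybar - b * xbar, b

def interpolate(xs, ys, x):
    """Many-parameter model: Lagrange polynomial through every point."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# Noisy samples of an underlying line y = 2x + 1
xs, ys = [0, 1, 2, 3], [1.1, 2.9, 5.2, 6.8]
a, b = fit_line(xs, ys)

x_new, y_true = 4, 9.0  # unseen test point on the true line
line_err = abs((a + b * x_new) - y_true)
poly_err = abs(interpolate(xs, ys, x_new) - y_true)
print(line_err < poly_err)  # → True: the simpler model generalises better
```

The interpolating polynomial has zero error on the training points, yet its prediction at the unseen point is far worse than the line's.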
Deterministic Models
- Deterministic models produce an output without a confidence measure
- e.g. for the fishy model, the prediction of whether the fish is salmon or sea bass is given without an estimate of how good the prediction is
- Deterministic models do not encode the uncertainty in the data
- This is in contrast to probabilistic models (next lecture)
Deterministic Models
To build a deterministic model:

1. Understand the task
2. Hypothesise the model’s type
3. Hypothesise the model’s complexity
4. Tune/train the model’s parameters
Another Fish Problem

Data: a set of data points D = {(x1, y1), (x2, y2), ..., (xN, yN)} where xi is the length of fish i and yi is the weight of fish i.

Task: build a model that can predict the weight of a fish from its length.

Model Type: assume there exists a polynomial relationship between length and weight.

Model Complexity: assume the relationship is linear, weight = a + b × length, i.e.

$$y_i = a + b x_i \qquad (1)$$

Model Parameters: the model has two parameters a and b which should be estimated:

- a is the y-intercept
- b is the slope of the line
Deterministic Model: Line Fitting

- Finding the linear model parameters amounts to finding the best fitting line given the data
- Criterion: the best fitting line is the one which minimises a distance measure from the points to the line
Deterministic Model: Line Fitting

- Find a, b which minimise

$$R(a, b) = \sum_{i=1}^{N} \big(y_i - (a + b x_i)\big)^2$$

- This is known as the residual
- A method which gives a closed-form solution is to minimise the sum of squared vertical offsets of the points from the line: the Method of Least Squares
Least Squares Solution

Example

The Ceres Orbit of Gauss:

On Jan 1, 1801, the Italian astronomer G. Piazzi discovered the asteroid Ceres. He was able to track the asteroid for six weeks, but it was lost due to interference caused by the sun. A number of leading astronomers published papers predicting the orbit of the asteroid. Gauss also published a forecast, but his predicted orbit differed considerably from the others. Ceres was relocated by one observer on Dec 7, 1801, and by another on Jan 1, 1802. In both cases the position was very close to that predicted by Gauss. Needless to say, Gauss won instant fame in astronomical circles and for a time was better known as an astronomer than as a mathematician. One of the keys to Gauss’s success was his use of the method of least squares.
source: Leon (1994). Linear Algebra and its Applications
Least Squares Solution
- A least squares problem is an overdetermined linear system of equations (i.e. the number of equations is much greater than the number of unknowns)
- Such systems are usually inconsistent
Least Squares Solution
Minimise the residual by taking the partial derivatives and setting them to zero (using the chain rule):

R(a, b) = \sum_i (y_i - (a + b x_i))^2

\frac{\partial R}{\partial a} = -2 \sum_i (y_i - (a + b x_i)) = 0

\frac{\partial R}{\partial b} = -2 \sum_i x_i (y_i - (a + b x_i)) = 0

a_{LS} = \bar{y} - b_{LS} \bar{x}

b_{LS} = \frac{\sum_i x_i y_i - N \bar{x} \bar{y}}{\sum_i x_i^2 - N \bar{x}^2}

where \bar{x} denotes the mean of \{x_i\} and \bar{y} the mean of \{y_i\}
Least Squares Solution Example

Example
Find the best least squares fit by a linear function to the data

x: -1  0  1  2
y:  0  1  3  9

\bar{x} = 0.5, \quad \bar{y} = 3.25

b_{LS} = \frac{\sum_i x_i y_i - N \bar{x} \bar{y}}{\sum_i x_i^2 - N \bar{x}^2} = \frac{21 - 4 \times 0.5 \times 3.25}{6 - 4 \times 0.5^2} = 2.9

a_{LS} = \bar{y} - b_{LS} \bar{x} = 3.25 - 0.5 b_{LS} = 1.8

y = 1.8 + 2.9x
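These closed-form expressions are easy to check numerically. A minimal sketch in Python with NumPy, reusing the data from the example (the variable names are mine, not from the lab code):

```python
import numpy as np

# Data from the example
x = np.array([-1.0, 0.0, 1.0, 2.0])
y = np.array([0.0, 1.0, 3.0, 9.0])
N = len(x)

x_bar, y_bar = x.mean(), y.mean()   # 0.5 and 3.25

# b_LS = (sum_i x_i y_i - N x_bar y_bar) / (sum_i x_i^2 - N x_bar^2)
b_ls = (np.sum(x * y) - N * x_bar * y_bar) / (np.sum(x ** 2) - N * x_bar ** 2)
# a_LS = y_bar - b_LS x_bar
a_ls = y_bar - b_ls * x_bar

print(round(b_ls, 6), round(a_ls, 6))   # 2.9 1.8
```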
Least Squares Solution - Outliers
I Outliers can have a disproportionate effect on parameter estimates when using least squares
I This is because the residual is defined in terms of squared differences, so a single distant point contributes heavily
I The 'best line' moves closer to outliers (Lab - week 15)
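The effect can be seen in a small sketch (NumPy assumed; the data here are a made-up illustration, not the week 15 lab data): a single corrupted point pulls the fitted slope well away from the true value.

```python
import numpy as np

def fit_line(x, y):
    """Closed-form least squares fit of y = a + b x."""
    N = len(x)
    b = (np.sum(x * y) - N * x.mean() * y.mean()) / (np.sum(x ** 2) - N * x.mean() ** 2)
    a = y.mean() - b * x.mean()
    return a, b

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_clean = x.copy()            # points exactly on the line y = x
y_outlier = y_clean.copy()
y_outlier[-1] = 10.0          # one corrupted measurement (was 4.0)

print(fit_line(x, y_clean))    # slope 1.0, intercept 0.0
print(fit_line(x, y_outlier))  # slope jumps to 2.2
```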
Least Squares Solution - matrix form
I The least squares solution can be defined using matrices and vectors
I This is easier when dealing with many variables

R(a, b) = \sum_i (y_i - (a + b x_i))^2 = \|y - Xa\|^2

where

y = \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix}, \quad X = \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_N \end{bmatrix}, \quad a = \begin{bmatrix} a \\ b \end{bmatrix}

y - Xa = \begin{bmatrix} y_1 - a - b x_1 \\ \vdots \\ y_N - a - b x_N \end{bmatrix}
Least Squares Solution - matrix form
I To solve least squares in matrix form, find a_{LS}:

\|y - X a_{LS}\|^2 = 0 \quad (minimise the vector's length)
y - X a_{LS} = 0 \quad (optimal vector is of length 0)
X a_{LS} = y \quad (re-arrange)
X^T X a_{LS} = X^T y \quad (multiply by X^T to get a square matrix)
a_{LS} = (X^T X)^{-1} X^T y \quad (matrix inverse)

a_{LS} = (X^T X)^{-1} X^T y

WARNING: This is not a derivation! It is only intended to give you intuition for the solution. For an accurate derivation please refer to: this derivation - p8
Least Squares Solution Example - again
Example
Find the best least squares fit by a linear function to the data

x: -1  0  1  2
y:  0  1  3  9

y = \begin{bmatrix} 0 \\ 1 \\ 3 \\ 9 \end{bmatrix}, \quad X = \begin{bmatrix} 1 & -1 \\ 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{bmatrix}

X^T X = \begin{bmatrix} 1 & 1 & 1 & 1 \\ -1 & 0 & 1 & 2 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 4 & 2 \\ 2 & 6 \end{bmatrix}

a_{LS} = (X^T X)^{-1} X^T y = \frac{1}{20} \begin{bmatrix} 6 & -2 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 & 1 \\ -1 & 0 & 1 & 2 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 3 \\ 9 \end{bmatrix} = \begin{bmatrix} 1.8 \\ 2.9 \end{bmatrix}

y = 1.8 + 2.9x
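This matrix computation maps directly onto NumPy. A sketch (one caveat, which is my addition: in practice a stable solver such as `np.linalg.lstsq` is preferred over forming the explicit inverse, which can be numerically ill-conditioned):

```python
import numpy as np

y = np.array([0.0, 1.0, 3.0, 9.0])
X = np.array([[1.0, -1.0],
              [1.0,  0.0],
              [1.0,  1.0],
              [1.0,  2.0]])

# Normal equations: a_LS = (X^T X)^{-1} X^T y
a_normal = np.linalg.inv(X.T @ X) @ X.T @ y

# Numerically preferred: a dedicated least squares solver
a_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.round(a_normal, 6))   # [1.8 2.9]
print(np.round(a_lstsq, 6))    # [1.8 2.9]
```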
K-D Least Squares - matrix form
I Matrix formulation allows the least squares method to be easily extended to data points in higher dimensions
I Consider a set of points D = \{(x_1, y_1), (x_2, y_2), \cdots, (x_N, y_N)\} where x_i has K dimensions
I For a model where y_i is linearly related to x_i:

y_i = a_0 + a_1 x_{i1} + a_2 x_{i2} + \cdots + a_K x_{iK} \quad (2)
K-D Least Squares - matrix form
I Solved in the same manner

y_{(N \times 1)} = \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix}, \quad X_{(N \times (K+1))} = \begin{bmatrix} 1 & x_{11} & \cdots & x_{1K} \\ \vdots & \vdots & & \vdots \\ 1 & x_{N1} & \cdots & x_{NK} \end{bmatrix}, \quad a_{((K+1) \times 1)} = \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_K \end{bmatrix}

R(a) = \|y - Xa\|^2

a_{LS} = (X^T X)^{-1} X^T y

where (X^T X) is a (K + 1) \times (K + 1) square matrix
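A sketch of the K-dimensional case with K = 2, using synthetic noiseless data so the true coefficients are known exactly (the data and coefficients here are mine, chosen purely for illustration):

```python
import numpy as np

# Synthetic targets from y = 1 + 2*x1 - 3*x2 (noiseless)
x1 = np.array([0.0, 1.0, 2.0, 0.0, 1.0])
x2 = np.array([0.0, 0.0, 1.0, 1.0, 2.0])
y = 1.0 + 2.0 * x1 - 3.0 * x2

# N x (K+1) design matrix with a leading column of ones
X = np.column_stack([np.ones_like(x1), x1, x2])

a_ls = np.linalg.inv(X.T @ X) @ X.T @ y
print(np.round(a_ls, 6))   # recovers the coefficients [1, 2, -3]
```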
General Least Squares - matrix form
I Matrix formulation also allows the least squares method to be extended to polynomial fitting
I For a polynomial of degree p (with p + 1 coefficients):

y_i = a_0 + a_1 x_i + a_2 x_i^2 + \cdots + a_p x_i^p
General Least Squares - matrix form
I Solved in the same manner

y_{(N \times 1)} = \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix}, \quad X_{(N \times (p+1))} = \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^p \\ 1 & x_2 & x_2^2 & \cdots & x_2^p \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_N & x_N^2 & \cdots & x_N^p \end{bmatrix}, \quad a_{((p+1) \times 1)} = \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_p \end{bmatrix}

a_{LS} = (X^T X)^{-1} X^T y

where (X^T X) is a (p + 1) \times (p + 1) square matrix
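Polynomial fitting is then just least squares with a Vandermonde-style design matrix. A sketch with NumPy (noiseless quadratic data of my choosing, so the coefficients are exactly recoverable):

```python
import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = 1.0 - 2.0 * x + 0.5 * x ** 2      # true coefficients [a0, a1, a2] = [1, -2, 0.5]

p = 2
# Columns 1, x, x^2, ..., x^p  ->  N x (p+1) design matrix
X = np.vander(x, p + 1, increasing=True)

a_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(a_ls, 6))   # recovers [1, -2, 0.5]
```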
Generalisation and Overfitting - again
[Figure: least squares polynomial fits of increasing degree p to the same data]
p = 1, Residual = 4.7557
p = 2, Residual = 3.7405
p = 3, Residual = 3.5744
p = 4, Residual = 3.4236
p = 5, Residual = 3.4217
Generalisation and Overfitting - again
I The choice of model complexity (degree p) has a strong effect on how well the model generalises to future data
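The pattern in the residuals above is general: the training residual can only decrease as p grows, so it cannot by itself tell us which model generalises best. A sketch on synthetic data (assumptions: NumPy, a fixed seed, and data generated from a noisy straight line of my choosing):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
y = 1.0 + 2.0 * x + 0.3 * rng.standard_normal(x.size)   # noisy line

residuals = {}
for p in range(1, 6):
    X = np.vander(x, p + 1, increasing=True)            # degree-p design matrix
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals[p] = float(np.sum((y - X @ a) ** 2))

# Higher-degree models always fit the training data at least as well
print(residuals)
```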
Further Reading
I Linear Algebra and its Applications, Lay (2012)
I Section 6.5
I Section 6.6
I Available online: http://www.math.usu.edu/powell/pseudoinverses.pdf