Computacion Inteligente: Least-Square Methods for System Identification
Post on 01-Jan-2016
Contents
– System Identification: an Introduction
– Least-Squares Estimators
– Statistical Properties of Least-Squares Estimators
– Maximum Likelihood (ML) Estimator
– Maximum Likelihood Estimator for Linear Model
– LSE for Nonlinear Models
– Developing Dynamic Models from Data
– Example: Tank Level Modeling
System Identification: Introduction
Goal
– Determine a mathematical model for an unknown system (the target system) by observing its input-output data pairs
System Identification: Introduction
Purposes
– To predict a system’s behavior, as in time series prediction and weather forecasting
– To explain the interactions and relationships between the inputs and outputs of a system
System Identification: Introduction
Context example
– To design a controller based on a model of the system, as in aircraft or ship control
– To simulate the system under control once the model is known
Why cover System Identification?
It is a well-established and easy-to-use technique for modeling a real-life system.
It will be needed for the section on fuzzy-neural networks.
Spring Example
Experimental data
Experiment   Force (newtons)   Length (inches)
1            1.1               1.5
2            1.9               2.1
3            3.2               2.5
4            4.4               3.3
5            5.9               4.1
6            7.4               4.6
7            9.2               5.0
What will the length be when the force is 5.0 newtons?
Components of System Identification
System identification involves two main steps
– Structure identification
– Parameter identification
Structure identification
Apply a priori knowledge about the target system to determine a class of models within which the search for the most suitable model is conducted
This class of models is denoted by a function $y = f(u; \theta)$ where:
• y is the model output
• u is the input vector
• θ is the parameter vector
Structure identification
The choice of $f(u; \theta)$ depends on
– the problem at hand
– the designer’s experience
– the laws of nature governing the target system
Parameter identification
– The same training inputs are applied to both the target system and the model.
– The difference between the target system output, $y_i$, and the mathematical model output, $\hat{y}_i$, is used to update the parameter vector θ.
Parameter identification
– The structure of the model is known, but optimization techniques are still needed
– to determine the parameter vector $\theta = \hat{\theta}$ such that the resulting model $\hat{y} = f(u; \hat{\theta})$ describes the system appropriately:
$|y_i - \hat{y}_i| \to 0$, with $y_i$ assigned to $u_i$
System Identification Process
The data set composed of m desired input-output pairs
– (ui, yi) (i = 1,…,m) is called the training data
System identification needs to perform both structure and parameter identification repeatedly until a satisfactory model is found
System Identification: Steps
1. Specify and parameterize a class of mathematical models representing the system to be identified
2. Perform parameter identification to choose the parameters that best fit the training data set
3. Conduct a validation test to see if the identified model responds correctly to an unseen data set
4. Terminate the procedure once the results of the validation test are satisfactory. Otherwise, select another class of models and repeat steps 2 to 4
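The four steps can be sketched as a loop over candidate model classes. This is only an illustrative sketch: the candidate classes (polynomials of increasing degree), the synthetic data, the tolerance `tol`, and the helper names `design_matrix` and `identify` are assumptions made for the example, not part of the original slides.

```python
import numpy as np

def design_matrix(u, degree):
    # One hypothetical model class per degree: columns 1, u, u^2, ..., u^degree
    return np.vander(u, degree + 1, increasing=True)

def identify(u_train, y_train, u_val, y_val, max_degree=5, tol=0.05):
    for degree in range(max_degree + 1):                       # step 1: choose a model class
        A = design_matrix(u_train, degree)
        theta, *_ = np.linalg.lstsq(A, y_train, rcond=None)    # step 2: fit the parameters
        y_pred = design_matrix(u_val, degree) @ theta          # step 3: test on unseen data
        rmse = np.sqrt(np.mean((y_val - y_pred) ** 2))
        if rmse < tol:                                         # step 4: stop when satisfactory
            return degree, theta, rmse
    raise RuntimeError("no candidate model class passed validation")

# Synthetic target system: a quadratic plus a small deterministic perturbation
u = np.linspace(-1.0, 1.0, 40)
y = 1.0 + 2.0 * u - 3.0 * u**2 + 0.01 * np.sin(50.0 * u)

# Even-indexed samples for training, odd-indexed samples for validation
degree, theta, rmse = identify(u[::2], y[::2], u[1::2], y[1::2])
print(degree, theta, rmse)
```

Degrees 0 and 1 fail the validation test, so the loop moves on to the next class and stops at the quadratic model.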
System Identification Process
Structure and parameter identification may need to be done repeatedly
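Least-Squares Estimators
Objective of Linear Least Squares fitting
Given a training data set {(ui, yi), i = 1, …, m} and the general form function:
$y = \theta_1 f_1(u) + \theta_2 f_2(u) + \dots + \theta_n f_n(u)$
Find the parameters θ1, …, θn such that the model output is the best estimate of the observed outputs yi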
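The linear model
The linear model:
$y = \theta_1 f_1(u) + \theta_2 f_2(u) + \dots + \theta_n f_n(u) = f^T(u)\,\theta$
where:
– u = (u1, …, up)T is the model input vector
– f1, …, fn are known functions of u, collected in f(u) = (f1(u), …, fn(u))T
– θ1, …, θn are unknown parameters to be estimated, collected in θ = (θ1, …, θn)T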
Least-Squares Estimators
The task of fitting data using a linear model is referred to as linear regression
where:
– u = (u1, …, up)T is the input vector
– f1(u), …, fn(u) are the regressors
– θ = (θ1, …, θn)T is the parameter vector
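Least-Squares Estimators
We collect the training data set {(ui, yi), i = 1, …, m}
The system equations become:
$$\begin{aligned}
\theta_1 f_1(u_1) + \theta_2 f_2(u_1) + \dots + \theta_n f_n(u_1) &= y_1 \\
\theta_1 f_1(u_2) + \theta_2 f_2(u_2) + \dots + \theta_n f_n(u_2) &= y_2 \\
&\;\;\vdots \\
\theta_1 f_1(u_m) + \theta_2 f_2(u_m) + \dots + \theta_n f_n(u_m) &= y_m
\end{aligned}$$
Which is equivalent to: $A\theta = y$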
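Least-Squares Estimators
Which is equivalent to: $A\theta = y$
– where
$$A = \begin{bmatrix} f_1(u_1) & \cdots & f_n(u_1) \\ \vdots & \ddots & \vdots \\ f_1(u_m) & \cdots & f_n(u_m) \end{bmatrix}, \quad
\theta = \begin{bmatrix} \theta_1 \\ \vdots \\ \theta_n \end{bmatrix}, \quad
y = \begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix}$$
– A is an m×n matrix, θ is the unknown n×1 vector, and y is an m×1 vector
– $A\theta = y \;\Rightarrow\; \theta = A^{-1}y$ (solution, valid only when m = n and A is nonsingular)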
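As a small illustration, the matrix A can be assembled directly from a list of basis functions. The basis (1, u, u²), the three sample points, and the helper name `design_matrix` are hypothetical choices for this sketch; with m = n the system can be solved exactly.

```python
import numpy as np

def design_matrix(basis, U):
    """Stack f_j(u_i) into the m-by-n matrix A, one row per sample u_i."""
    return np.array([[f(u) for f in basis] for u in U], dtype=float)

# Hypothetical basis for a scalar input: f1(u) = 1, f2(u) = u, f3(u) = u^2
basis = [lambda u: 1.0, lambda u: u, lambda u: u**2]

U = [0.0, 1.0, 2.0]             # m = 3 samples, so A is square (m = n)
y = np.array([1.0, 2.0, 5.0])   # outputs generated by y = 1 + u^2

A = design_matrix(basis, U)
theta = np.linalg.solve(A, y)   # exact solve is possible only because A is square and nonsingular
print(A)
print(theta)
```

When m is greater than n, `np.linalg.solve` no longer applies and a least-squares solution is needed instead.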
Least-Squares Estimators
We have
– m outputs, and
– n fitting parameters to find
Or
– m equations, and
– n unknown variables
Usually m is greater than n, so the system is overdetermined
Least-Squares Estimators
Since the model is just an approximation of the target system and the observed data might be corrupted by noise,
– an exact solution is not always possible
To account for this, an error vector e is added:
$A\theta + e = y$
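Least-Squares Estimators
Our goal is now to find the estimate $\hat{\theta}$ that reduces the errors between the observed outputs $y$ and the model outputs $\hat{y} = A\hat{\theta}$
The problem: find the estimate $\hat{\theta}$ that minimizes the sum of squared errors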
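Least-Squares Estimators
If $e = y - A\theta$ then:
$$E(\theta) = \sum_{i=1}^{m} (y_i - a_i^T\theta)^2 = e^T e = (y - A\theta)^T (y - A\theta)$$
where $a_i^T$ is the i-th row of A
We need to compute: $\min_{\theta} E(\theta)$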
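A short worked derivation shows where the minimizer comes from. It assumes only the definition $E(\theta) = (y - A\theta)^T (y - A\theta)$ and standard matrix calculus:

```latex
\begin{aligned}
E(\theta) &= (y - A\theta)^T (y - A\theta)
           = y^T y - 2\,\theta^T A^T y + \theta^T A^T A\,\theta \\
\nabla_\theta E(\theta) &= -2\,A^T y + 2\,A^T A\,\theta \\
\nabla_\theta E(\theta) = 0 \;&\Longrightarrow\; A^T A\,\theta = A^T y
\end{aligned}
```

Because $A^T A$ is positive semidefinite, $E(\theta)$ is convex and any stationary point is a global minimum.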
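Least-Squares Estimators
Theorem [least-squares estimator]
The squared error $E(\theta)$ is minimized when $\theta = \hat{\theta}$ satisfies the normal equation
$A^T A\,\hat{\theta} = A^T y$
If $A^T A$ is nonsingular, $\hat{\theta}$ is unique and is given by
$\hat{\theta} = (A^T A)^{-1} A^T y$
$\hat{\theta}$ is called the least-squares estimator (LSE)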
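The theorem can be checked numerically. The sketch below solves the normal equation on a small made-up data set and compares the result with NumPy's `np.linalg.lstsq`; the data values and the helper name `lse_normal_equation` are illustrative assumptions.

```python
import numpy as np

def lse_normal_equation(A, y):
    # Solve A^T A theta = A^T y (requires A^T A to be nonsingular)
    return np.linalg.solve(A.T @ A, A.T @ y)

# Made-up overdetermined system: m = 4 equations, n = 2 unknowns
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([0.1, 0.9, 2.1, 2.9])

theta_ne = lse_normal_equation(A, y)
theta_ls, *_ = np.linalg.lstsq(A, y, rcond=None)   # SVD-based least-squares solver

residual = y - A @ theta_ne
print(theta_ne, theta_ls)
print(A.T @ residual)   # the residual is orthogonal to the columns of A
```

Forming $A^T A$ explicitly squares the condition number of the problem, so for ill-conditioned data an orthogonalization-based solver such as `lstsq` is usually the safer choice.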
Spring Example
– Structure identification can be done using domain knowledge.
– Hooke’s law: the change in length of a spring is proportional to the force applied.
length = k0 + k1*force
where k0 is the unloaded length of the spring and k1 is the change in length per unit force
Spring Example
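Parameter identification for the spring can be sketched as follows, using the seven measurements from the experimental data table and the model length = k0 + k1*force. The variable names are illustrative.

```python
import numpy as np

# Experimental data from the spring table
force = np.array([1.1, 1.9, 3.2, 4.4, 5.9, 7.4, 9.2])    # newtons
length = np.array([1.5, 2.1, 2.5, 3.3, 4.1, 4.6, 5.0])   # inches

# Regressors f1(u) = 1 and f2(u) = u, so each row of A is [1, force_i]
A = np.column_stack([np.ones_like(force), force])
theta, *_ = np.linalg.lstsq(A, length, rcond=None)
k0, k1 = theta

predicted = k0 + k1 * 5.0   # model answer to "length at 5.0 newtons?"
print(k0, k1, predicted)
```

The fitted line predicts a length of roughly 3.4 inches at 5.0 newtons, which lies between the measurements taken at 4.4 and 5.9 newtons.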
Statistical Properties of Least-Squares Estimators
Statistical qualities of LSE
Definition [unbiased estimator]
An estimator $\hat{\theta}$ of the parameter θ is unbiased if
$E[\hat{\theta}] = \theta$
where E[.] is the statistical expectation
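Statistical qualities of LSE
Definition [minimal variance]
– An estimator $\hat{\theta}$ is a minimum variance estimator if for any other estimator $\theta^*$:
$\operatorname{cov}(\hat{\theta}) \le \operatorname{cov}(\theta^*)$
where cov(θ) is the covariance matrix of the random vector θ, and the inequality means that $\operatorname{cov}(\theta^*) - \operatorname{cov}(\hat{\theta})$ is positive semidefinite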
Statistical qualities of LSE
Theorem [Gauss-Markov]:
– Gauss-Markov conditions:
• The error vector e is a vector of m uncorrelated random variables, each with zero mean and the same variance $\sigma^2$.
• This means that:
$E[e] = 0, \qquad \operatorname{cov}(e) = E[e\,e^T] = \sigma^2 I$
Statistical qualities of LSE
Theorem [Gauss-Markov]
Under the Gauss-Markov conditions, the LSE is unbiased and has minimum variance among all linear unbiased estimators.
Proof:
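A sketch of the unbiasedness part, assuming the data are generated by $y = A\theta + e$ with a deterministic full-column-rank matrix A, zero-mean errors, and $E[e\,e^T] = \sigma^2 I$:

```latex
\begin{aligned}
\hat{\theta} &= (A^T A)^{-1} A^T y
              = (A^T A)^{-1} A^T (A\theta + e)
              = \theta + (A^T A)^{-1} A^T e \\
E[\hat{\theta}] &= \theta + (A^T A)^{-1} A^T E[e] = \theta \\
\operatorname{cov}(\hat{\theta})
  &= (A^T A)^{-1} A^T \, E[e\,e^T] \, A \,(A^T A)^{-1}
   = \sigma^2 (A^T A)^{-1}
\end{aligned}
```

The minimum-variance part compares this covariance with that of an arbitrary linear unbiased estimator and is not reproduced here.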
Maximum likelihood (ML) estimator
The problem
– Suppose we observe m independent samples
x1, x2, …, xm,
– coming from a probability density function with parameters θ1, …, θr
Maximum likelihood (ML) estimator
The criterion for choosing θ is:
– Choose the parameters that maximize the probability of the observed data
Which one do you prefer? Why?
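Maximum likelihood (ML) estimator
Likelihood function definition:
– For a sample of m observations x1, x2, …, xm
– drawn independently from the probability density function f,
– the likelihood function L is defined by
$L(\theta) = f(x_1; \theta)\, f(x_2; \theta) \cdots f(x_m; \theta) = \prod_{i=1}^{m} f(x_i; \theta)$
L is the joint probability density of the sample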
Maximum likelihood (ML) estimator
The ML estimator is defined as the value of θ which maximizes L:
$\hat{\theta} = \arg\max_{\theta} L(\theta)$
or equivalently:
$\hat{\theta} = \arg\max_{\theta} \ln L(\theta)$
Maximum likelihood (ML) estimator
Example: ML estimation for normal distribution
– Suppose we have m independent samples x1, x2, …, xm, coming from a Gaussian distribution with parameters μ and σ2:
$$f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right]$$
Which is the MLE for μ and σ2?
Maximum likelihood (ML) estimator
Example: ML estimation for normal distribution
– For m observations x1, x2, …, xm, we have:
$$L(\mu, \sigma^2) = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi\sigma^2}}\; e^{-\frac{(x_i - \mu)^2}{2\sigma^2}}$$
Maximum likelihood (ML) estimator
Example: ML estimation for normal distribution
– For m observations x1, x2, …, xm, we have:
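For reference, the standard derivation works with the log-likelihood of the Gaussian sample and sets its partial derivatives to zero:

```latex
\begin{aligned}
\ln L(\mu, \sigma^2) &= -\frac{m}{2}\ln(2\pi\sigma^2)
                       - \frac{1}{2\sigma^2}\sum_{i=1}^{m} (x_i - \mu)^2 \\
\frac{\partial \ln L}{\partial \mu}
  &= \frac{1}{\sigma^2}\sum_{i=1}^{m} (x_i - \mu) = 0
  \;\Longrightarrow\; \hat{\mu} = \frac{1}{m}\sum_{i=1}^{m} x_i \\
\frac{\partial \ln L}{\partial \sigma^2}
  &= -\frac{m}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{m} (x_i - \mu)^2 = 0
  \;\Longrightarrow\; \hat{\sigma}^2 = \frac{1}{m}\sum_{i=1}^{m} (x_i - \hat{\mu})^2
\end{aligned}
```

The ML estimate of the variance divides by m rather than m - 1, so it is biased for finite samples.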
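Maximum likelihood estimator for linear model
– Let a linear model be given as
$y = f^T(u)\,\theta + e$
– Then
$e = y - f^T(u)\,\theta$
– Here e has PDF pe(u, θ) and the samples are independent. The likelihood function is given by
$L(\theta) = \prod_{i=1}^{m} p_e\big(y_i - f^T(u_i)\,\theta\big)$
Maximum likelihood estimator for linear model
– Assume a regression model where the errors are normally distributed with zero mean and variance $\sigma^2$.
– The likelihood function is given by
$$L(\theta) = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{\big(y_i - f^T(u_i)\,\theta\big)^2}{2\sigma^2}\right]$$
Maximum likelihood estimator for linear model
The maximum likelihood model
– Any algorithm that maximizes $L(\theta)$
– gives the maximum likelihood model with respect to a given family of possible models
Maximum likelihood estimator for linear model
– Same as maximizing
$$\ln L(\theta) = -\frac{m}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{m} \big(y_i - f^T(u_i)\,\theta\big)^2$$
– Same as minimizing
$$\sum_{i=1}^{m} \big(y_i - f^T(u_i)\,\theta\big)^2 = (y - A\theta)^T (y - A\theta)$$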
Connection to Least Squares
Conclusion
– The least-squares fitting criterion can be understood as emerging from the use of the maximum likelihood principle for estimating a regression model where errors are distributed normally.
– The applicability of the least-squares method is, however, not limited to the normality assumption.
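The connection can be checked numerically: under Gaussian errors, no parameter vector should give a higher log-likelihood than the least-squares estimate. The data set, the assumed noise variance `sigma2`, and the function name `gaussian_log_likelihood` are illustrative assumptions.

```python
import numpy as np

def gaussian_log_likelihood(theta, A, y, sigma2):
    # ln L(theta) for y = A theta + e, with e ~ N(0, sigma2 * I)
    r = y - A @ theta
    m = len(y)
    return -0.5 * m * np.log(2.0 * np.pi * sigma2) - (r @ r) / (2.0 * sigma2)

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([0.1, 0.9, 2.1, 2.9])
sigma2 = 0.25   # assumed known; its value does not change the maximizer

theta_lse, *_ = np.linalg.lstsq(A, y, rcond=None)
best = gaussian_log_likelihood(theta_lse, A, y, sigma2)

# Evaluate the log-likelihood at randomly perturbed parameter vectors
rng = np.random.default_rng(0)
perturbed = [
    gaussian_log_likelihood(theta_lse + 0.1 * rng.standard_normal(2), A, y, sigma2)
    for _ in range(100)
]
print(best, max(perturbed))
```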
LSE for Nonlinear Models
Nonlinear models are divided into two families
– Intrinsically linear
– Intrinsically nonlinear
• Through appropriate transformations of the input-output variables & fitting parameters, an intrinsically linear model can become a linear model
• By this transformation into linear models, LSE can be used to optimize the unknown parameters
LSE for Nonlinear Models
Examples of intrinsically linear systems
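One common textbook case, used here as an assumed illustration, is the exponential model $y = a\,e^{bu}$. Taking logarithms gives $\ln y = \ln a + b\,u$, which is linear in the transformed parameters $\ln a$ and $b$. The parameter values and data below are synthetic.

```python
import numpy as np

# Hypothetical true parameters of y = a * exp(b * u)
a_true, b_true = 2.0, -0.5

u = np.linspace(0.0, 4.0, 9)
y = a_true * np.exp(b_true * u)    # noise-free synthetic data, all outputs positive

# Transformed linear model: ln y = ln a + b * u
A = np.column_stack([np.ones_like(u), u])
theta, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)

a_hat = np.exp(theta[0])   # undo the transformation of the first parameter
b_hat = theta[1]
print(a_hat, b_hat)
```

With noisy data, this fit minimizes the squared error in $\ln y$ rather than in $y$, so the result can differ from a direct nonlinear least-squares fit.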
Developing Dynamic Models from Data
Dynamical System?
Figure: block diagram of a system with input u(t) and output y(t)
$\hat{y}(t) = S\big(y(t-1), \dots, y(t-n),\; u(t-1), u(t-2), \dots, u(t-m)\big)$
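The ARX model
In dynamic systems analysis, the independent variable is often discrete time (k)
– An ARX model (AutoRegressive with eXogenous input model) is often used, where
$y(k) + a_1 y(k-1) + \dots + a_n y(k-n) = b_1 u(k-1) + \dots + b_m u(k-m) + e(k)$
The ARX model
Or equivalently
– writing
$y(k) = -a_1 y(k-1) - \dots - a_n y(k-n) + b_1 u(k-1) + \dots + b_m u(k-m) + e(k)$
The ARX model as a linear regressor
The input-output relationship can take the form
$y(k) = \varphi^T(k)\,\theta + e(k)$
– where
Regression vector: $\varphi(k) = [-y(k-1), \dots, -y(k-n),\; u(k-1), \dots, u(k-m)]^T$
Parameter vector to estimate: $\theta = [a_1, \dots, a_n,\; b_1, \dots, b_m]^T$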
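Prediction error model estimation
The problem
– Assume input-output data
$\{u(k), y(k)\}, \quad k = 1, \dots, N$
– Build the predictor
$\hat{y}(k \mid \theta) = \varphi^T(k)\,\theta$
with regression vector φ(k) and parameter vector θ
– Such that θ minimizes the prediction error
$\varepsilon(k, \theta) = y(k) - \hat{y}(k \mid \theta)$
Prediction error model estimation
– The model is fitted to the data by minimizing the criterion function
$$V_N(\theta) = \frac{1}{N}\sum_{k=1}^{N} \varepsilon^2(k, \theta)$$
Which gives the least squares criterion
$$V_N(\theta) = \frac{1}{N}\sum_{k=1}^{N} \big(y(k) - \varphi^T(k)\,\theta\big)^2$$
Prediction error model estimation
Solution
– Normal equation
$$\left[\frac{1}{N}\sum_{k=1}^{N} \varphi(k)\,\varphi^T(k)\right] \hat{\theta}_N = \frac{1}{N}\sum_{k=1}^{N} \varphi(k)\,y(k)$$
– Estimates
$$\hat{\theta}_N^{LS} = \arg\min_{\theta} V_N(\theta) = \left[\sum_{k=1}^{N} \varphi(k)\,\varphi^T(k)\right]^{-1} \sum_{k=1}^{N} \varphi(k)\,y(k)$$
Prediction error model estimation
In matrix form, the solution is the standard linear least squares formula
$\hat{\theta}_N = (\Phi^T \Phi)^{-1} \Phi^T y$
where the rows of Φ are φT(k) and y stacks the outputs y(k), k = 1, …, N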
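A minimal sketch of ARX estimation by least squares follows. It assumes the common sign convention y(k) = -a1 y(k-1) - ... - an y(k-n) + b1 u(k-1) + ... + bm u(k-m) + e(k), a first-order system with made-up coefficients, and synthetic white-noise input. The helper names `simulate_arx` and `arx_regression` are hypothetical.

```python
import numpy as np

def simulate_arx(a, b, u, noise):
    """Simulate y(k) = -sum_i a_i y(k-i) + sum_j b_j u(k-j) + e(k), zero initial conditions."""
    n, m = len(a), len(b)
    y = np.zeros(len(u))
    for k in range(len(u)):
        ar = sum(-a[i] * y[k - 1 - i] for i in range(n) if k - 1 - i >= 0)
        ex = sum(b[j] * u[k - 1 - j] for j in range(m) if k - 1 - j >= 0)
        y[k] = ar + ex + noise[k]
    return y

def arx_regression(y, u, n, m):
    """Build Phi (rows are phi^T(k)) and the matching output vector."""
    start = max(n, m)
    rows = []
    for k in range(start, len(y)):
        past_y = [-y[k - i] for i in range(1, n + 1)]
        past_u = [u[k - j] for j in range(1, m + 1)]
        rows.append(past_y + past_u)
    return np.array(rows), y[start:]

rng = np.random.default_rng(0)
u = rng.standard_normal(500)           # synthetic input signal
e = 0.01 * rng.standard_normal(500)    # small equation noise
a_true, b_true = [-0.8], [0.5]         # hypothetical system: y(k) = 0.8 y(k-1) + 0.5 u(k-1) + e(k)

y = simulate_arx(a_true, b_true, u, e)
Phi, target = arx_regression(y, u, n=1, m=1)

# Standard linear least squares formula: theta = (Phi^T Phi)^{-1} Phi^T y
theta_hat = np.linalg.solve(Phi.T @ Phi, Phi.T @ target)
print(theta_hat)   # estimates of [a1, b1]
```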
Example: Tank level modeling
The identification goal
– To explain how the voltage u(t) (the input) affects the water level h(t) (the output) of the tank
Experimental data
Simple ARX modeling
A plausible first identification attempt is to try a simple linear regression model
– The parameters can easily be estimated using linear least squares, resulting in
ARX model results
– The simulated water level follows the true level, but at levels close to zero the linear model produces negative levels.
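Semiphysical modeling
The model equation is based on the dynamic conservation of mass
– The accumulation of mass in the tank is equal to the mass flow rate into the tank minus the mass flow rate out:
$A\,\dfrac{dh}{dt} = q_i - q_o$
where A here denotes the cross-sectional area of the tank (not the regression matrix)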
Semiphysical modeling
While the inflow is roughly proportional to u(t), the outflow can be approximated using Bernoulli’s law as proportional to $\sqrt{h(t)}$
– The parameters can easily be estimated using linear least squares, resulting in
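The idea can be sketched on synthetic data. The discretized model below, h(t) = h(t-1) + θ1·sqrt(h(t-1)) + θ2·u(t-1), is an assumed Euler-type form of the mass balance, not necessarily the exact equation used in the original example. The coefficient values, input signal, and function name `simulate_tank` are hypothetical.

```python
import numpy as np

# Hypothetical coefficients: theta[0] < 0 models outflow, theta[1] > 0 models inflow
theta_true = np.array([-0.05, 0.02])

def simulate_tank(u, theta, h0=1.0):
    """Simulate h(t) = h(t-1) + theta[0]*sqrt(h(t-1)) + theta[1]*u(t-1)."""
    h = np.empty(len(u) + 1)
    h[0] = h0
    for t in range(len(u)):
        h[t + 1] = h[t] + theta[0] * np.sqrt(h[t]) + theta[1] * u[t]
    return h

rng = np.random.default_rng(1)
u = rng.uniform(0.0, 5.0, size=400)   # synthetic input voltage
h = simulate_tank(u, theta_true)      # the level stays positive for these values

# The model is nonlinear in h but linear in theta, so linear least squares applies:
# h(t) - h(t-1) = theta1 * sqrt(h(t-1)) + theta2 * u(t-1)
Phi = np.column_stack([np.sqrt(h[:-1]), u])
target = h[1:] - h[:-1]
theta_hat, *_ = np.linalg.lstsq(Phi, target, rcond=None)
print(theta_hat)
```

The physical insight enters only through the choice of regressors, here sqrt(h(t-1)) and u(t-1).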
Semiphysical model results
The RMS error of this model is lower than that of the ARX model and, more importantly, no simulated output is negative, which is consistent with the physics of the tank
Sources
Jyh-Shing Roger Jang, Chuen-Tsai Sun and Eiji Mizutani, Slides for Ch. 5 of “Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence”, First Edition, Prentice Hall, 1997.
Djamel Bouchaffra, Soft Computing. Course materials. Oakland University. Fall 2005.
Henrik Melgaard, Identification of Physical Models. Institute of Mathematical Modelling, Technical University of Denmark. Ph.D. thesis. 1994.
Lucidi delle lezioni, Soft Computing. Materiale Didattico. Dipartimento di Elettronica e Informazione. Politecnico di Milano. 2004.
Peter Lindskog, Fuzzy Identification from a Grey Box Modeling Point of View. Department of Electrical Engineering, Linköping University. 1997.
Jacob Roll, Local and Piecewise Affine Approaches to System Identification. Department of Electrical Engineering, Linköping University, Linköping, Sweden. 2003.