
Identification, Adaptation, Learning

The Science of Learning Models from Data

Edited by Sergio Bittanti and Giorgio Picci

NATO ASI Series

Series F: Computer and Systems Sciences, Vol. 153

Identification, Adaptation, Learning
The Science of Learning Models from Data

Edited by

Sergio Bittanti
Politecnico di Milano
Piazza Leonardo da Vinci 32
I-20133 Milano, Italy

Giorgio Picci
Università di Padova
Via Gradenigo 6/A
I-35131 Padova, Italy

Springer
Published in cooperation with NATO Scientific Affairs Division

Dedicated to the memory of E.J. Hannan (1921–1994)

Photograph by Peter Hall, taken at the farewell to Geof Watson, Princeton University, 1992

Preface

This book collects the lectures given at the NATO Advanced Study Institute From Identification to Learning, held in Villa Olmo, Como, Italy, from August 22 to September 2, 1994.

The school was devoted to the themes of Identification, Adaptation and Learning, as they are currently understood in the Information and Control engineering community, their development in the last few decades, their interconnections and their applications. These titles describe challenging, exciting and rapidly growing research areas which are of interest both to control and communication engineers and to statisticians and computer scientists.

In accordance with the general goals of the Institute, and notwithstanding the rather advanced level of the topics discussed, the presentations have been generally kept at a fairly tutorial level. For this reason this book should be valuable to a variety of researchers and to graduate students interested in the general area of Control, Signals and Information Processing. As the goal of the school was to explore a common methodological line of reading the issues, the flavor is quite interdisciplinary. We regard this as an original and valuable feature of this book.

During the two weeks of the school at Villa Olmo we experienced a unique atmosphere and a most remarkable climate of interaction and communication between the outstanding experts gathered in Como for this occasion. It is remarkable that some of them hardly ever meet at conferences or at scientific meetings, as their different fields have traditionally evolved along separate lines. The openness and active participation in discussions by both students and speakers was a major factor in the success of this Advanced Study Institute. The editors of this volume would like to thank all lecturers, and the remaining members of the Organizing Committee, S.K. Mitter and Jan C. Willems, for their helpful advice.

The superb local organization provided by the Centro di Cultura Scientifica Alessandro Volta deserves a primary acknowledgement, for it was a major factor for the smooth development of the school. Special thanks go to Manuela Toglio for her care in the general organizational aspects, and to Emanuela Salati for her kindness and patience in dealing with daily problems of students and teachers.


Last but not least, we would like to thank NATO for believing in this project and for the generous support of the Institute, and the Consiglio Nazionale delle Ricerche (CNR) of Italy, which also provided financial funding. The general support of the Dipartimento di Elettronica e Informazione of the Politecnico di Milano is also gratefully acknowledged.

The final software layout of the book is due to Stefano Bertoncello, with the assistance of Marco Lovera.

March 1996 Sergio Bittanti and Giorgio Picci

Table of Contents

Geometric Methods for State Space Identification
Anders Lindquist and Giorgio Picci

1. Introduction
   1.1 Stationary Signals and the Statistical Theory of Model Building
   1.2 Input-Output Models
2. State Space Models of Stationary Processes
3. Spectral Factorization
4. Spectral Factorization and the LMI
   4.1 Ordering, (A,C) Pairs and Uniform Choice of Basis in I
5. Finite-Interval Realizations of a Stationary Process
   5.1 Forward and Backward Kalman Filtering and the Family of Minimal Stationary Realizations of U
   5.2 Finite-Interval Realizations
6. Estimation, Partial Realization and Balancing
   6.1 Positivity
   6.2 The Hilbert Space of a Stationary Signal
   6.3 Identification Based on Finite Data
   6.4 The Partial Realization Problem
   6.5 Partial Realization via SVD
   6.6 Stochastic Balanced Realizations: the Stationary Setting
   6.7 Stochastic Balanced Realizations: the Case of Finite Data
7. The "Subspace Methods" Identification Algorithm of Van Overschee and De Moor
   7.1 Choosing Bases in the Predictor Spaces
   7.2 Skipping some Redundant Steps
   7.3 The Least Squares Implementation
   7.4 Use of the SVD

Parameter Estimation of Multivariable Systems Using Balanced Realizations
J.M. Maciejowski

1.
2. Problem Setting
3. Identifiable Parametrizations
4. Balanced Parametrization
5. Some Useful Classes of Models
   5.1 Minimum-Phase Models
   5.2 Positive-Real Models
6. Outline of Parameter Estimation
7. Gradient Calculations
8. Finding an Initial Model
   8.1 Available Methods
   8.2 Realization Methods
   8.3 Subspace Methods
   8.4 Guaranteeing Stability
   8.5 Estimating n
9. Examples
   9.1 Distillation Column
   9.2 Industrial Dryer
   9.3 Sea Wave Spectrum
10. Conclusions

Balanced Canonical Forms
Raimund J. Ober

1. Introduction
2. Lyapunov Balanced Realizations and Model Reduction
3. A Lyapunov Balanced Canonical Form for Stable Continuous-Time Systems
4. L-Characteristic, LQG-Balanced Canonical Form and Model Reduction for Minimal Systems
5. Characteristics, Canonical Forms and Model Reduction for Bounded-Real and Positive-Real Systems
6. Concluding Remarks

From Data to State Model
Paolo Rapisarda and Jan C. Willems

1. Introduction
2. Background
   2.1 Discrete Time Systems
   2.2 Latent Variables
   2.3 State Models
   2.4 Existence and Uniqueness of State Space Models
   2.5 Input/State/Output, Output Nulling, and Driving Variable Representations
   2.6 Recapitulation
3. From Difference Equation to State Models
   3.1 Basic Notions
   3.2 From Kernel Representations to X(ξ)
   3.3 From Hybrid Representation to X(ξ)
   3.4 From Image Representation to X(ξ)
   3.5 From X(ξ) to State Equation
   3.6 Y(ξ)
   3.7 Simulation
   3.8 Recapitulation
4. The Most Powerful Unfalsified Model
   4.1 Basics
   4.2 Existence of the Most Powerful Unfalsified Model
   4.3 From Time Series to State Space Model
   4.4 Realization Theory as a Special Case
   4.5 Recapitulation
5. Algorithms
   5.1 From Behavior to State Space Model
   5.2 From Time Series to State Space Model I
   5.3 Common Features and Relative Row Rank
   5.4 From Time Series to State Model II
   5.5 Subspace Identification
   5.6 Realization Theory as a Special Case
6. Approximate Modeling
7. Simulations
A. Notation

Identification of Linear Systems from Noisy Data
Manfred Deistler and Wolfgang Scherrer

1. Introduction
2. The Model
3. The Frisch Case, Bivariate Observations
4. The Frisch Case, General n
5. The Bounded Noise Case

Identification in H∞: Theory and Applications
Pramod P. Khargonekar, Guoxiang Gu, and Jonathan Friedman

1. Introduction
2. Problem Formulation
   2.1 Discrete-Time Systems
   2.2 Experimental Data
   2.3 Identification in H∞
3. Background Results
4. Linear Algorithms
   4.1 The Kernel Function
   4.2 Error Analysis
5. Nonlinear Algorithms
   5.1 Two-Stage Nonlinear Algorithm
   5.2 Convex and Concave Windows
   5.3 Frequency Domain Analysis
   5.4 Trapezoidal Window
6. Engineering Applications

System Identification with Information Theoretic Criteria
A.A. Stoorvogel and J.H. van Schuppen

1. Introduction
2. Problem Formulation
3. Approximation with Mutual Information
   3.1 Mutual Information
   3.2 A Parameter Estimation Problem
   3.3 Relation of Mutual Information, H∞ Entropy, and LEQG Cost
   3.4 Parameter Estimation with an Exponential-of-Quadratic Cost
   3.5 Parameter Estimation with H∞ Entropy
4. Approximation with Likelihood and Divergence
   4.1 Approximation with the Likelihood Function
   4.2 Divergence
   4.3 Relation of Likelihood Function and Divergence
   4.4 Approximation with Divergence
   4.5 Parameter Estimation by Divergence Minimization
5. Concluding Remarks
A. Concepts from Probability and the Theory of Stochastic Processes
   A.1 Probability Concepts
   A.2 Gaussian Random Variables
   A.3 Concepts from the Theory of Stochastic Processes
B. Concepts from System Theory
C. Concepts from Information Theory
D. Information Measures of Gaussian Random Variables
E. Information Measures of Stationary Gaussian Processes
F. LEQG Optimal Stochastic Control
G. H-Infinity Control with an Entropy Criterion

Least Squares Based Self-Tuning Control Systems
Sergio Bittanti and Marco Campi

1. Introduction
2. Self-Tuning Adaptive Control
   2.1 Basic Self-Tuning Concepts
   2.2 Mathematical Framework
3. What Makes a ST Control System Nice?
   3.1 Imaginary and Asymptotic Imaginary Systems
   3.2 Adaptive Stabilization
   3.3 Self-Optimality
   3.4 List of Symbols
4. The Least Squares Identification Algorithm
5. Deterministic Analysis of Self-Tuning Control Systems
   5.1 Excitation Subspace and Convergence of RLS
   5.2 Adaptive Stabilization and Self-Optimality
   5.3 Specific Control Laws
6. Stochastic Analysis of Self-Tuning Control Systems
   6.1 Stochastic Excitation Subspace and Convergence of RLS
   6.2 Stochastic Adaptive Stabilization and Self-Optimality
7. Concluding Remarks: Towards a Theory of Tuning

On Neural Network Model Structures in System Identification
L. Ljung, J. Sjöberg, and H. Hjalmarsson

1. Introduction and Summary
   1.1 What is the Problem?
   1.2 Black Boxes
   1.3 Nonlinear Black Box Models
   1.4 Estimating g_N
   1.5 Properties of the Estimated Model
   1.6 Basis Functions
   1.7 What Is the Neural Network Identification Approach?
   1.8 Why Have Neural Networks Attracted So Much Interest?
   1.9 Related Approaches
2. The Problem
   2.1 Inferring Relationships from Data
   2.2 Prior Assumptions
   2.3 Function Classes
3. Some General Estimation Results
4. The Bias/Variance Trade-Off
5. Neural Nets
   5.1 Feedforward Neural Nets
   5.2 Recurrent Neural Nets
6. Algorithmic Aspects
   6.1 Search Directions
   6.2 Back-Propagation: Calculation of the Gradient
   6.3 Implicit Regularization
   6.4 Off-line and On-line Algorithms
   6.5 Local Minima
7. Adaptive Methods
   7.1 Adaptive Basis Function Expansion
   7.2 The "Curse" of Dimensionality
   7.3 Methods to Avoid the "Curse"
8. Specific Properties of NN Structures
9. Models of Dynamical Systems Based on Neural Networks
   9.1 A Review of Linear Black Box Models
   9.2 Choice of Regressors for Neural Network Models
   9.3 Neural Network Dynamic Models
   9.4 Some Other Structural Questions
   9.5 The Identification Procedure

An Overview of Computational Learning Theory and Its Applications to Neural Network Training
M. Vidyasagar

1. Introduction
2. Problem Formulation
3. Summary of Known Results
4. Families of Measures with a Nonempty Interior
5. Totally Bounded Families of Measures
6. Two Sufficient Conditions
7. Conclusions
Appendix: A Counterexample

Just-in-Time Learning and Estimation
George Cybenko

1. Introduction
2. Global Models
3. Local Models
4. Just-In-Time Models
   4.1 Analysis of Just-In-Time Models
   4.2 Discussion

Wavelets in Identification
A. Benveniste, A. Juditsky, B. Delyon, Q. Zhang, and P-Y. Glorennec

1. Introduction, Motivations, Basic Problems
   1.1 Two Application Examples
2. Basic Mathematical Problems
3. Classical Methods of Nonlinear System Identification: Linear Nonparametric Estimators
   3.1 Projection Estimates as an Example of Linear Nonparametric Estimators
   3.2 Choice of Model Order, Bandwidth, or Binwidth: the Generalized Cross Validation GCV Method
4. Performance Analysis of the Nonparametric Estimators
   4.1 Lower Bounds for Best Achievable Performance
   4.2 Discussion
5. Nonlinear Estimates
6. Wavelets: What They Are, and Their Use in Approximating Functions
   6.1 The Continuous Wavelet Transform
   6.2 The Discrete Wavelet Transform: Orthonormal Bases of Wavelets and Extensions
   6.3 Wavelets and Besov Spaces
7. Wavelets: Their Use in Nonparametric Estimation
8. A Wavelet Network for Practical System Identification
   8.1 The Wavelet Network and Its Structure
   8.2 Constructing the Wavelet Library W
   8.3 Selecting Best Wavelet Regressors
9. Fuzzy Models: Expressing Prior Information in Nonlinear Nonparametric Models
10. Experimental Results
   10.1 Modelling the Gas Turbine System
   10.2 Modelling the Hydraulic Actuator of the Robot Arm
11. Discussion and Conclusions

Fuzzy Logic Modelling and Control

1. Motivation
2. Fuzzy Logic Basic Concepts
   2.1 Fuzzy Logic Variables
   2.2 Fuzzy Logic Operations
   2.3 Approximated Reasoning
   2.4 Defuzzification
   2.5 Fuzzy Systems
3. Fuzzy Logic Controller Structure
   3.1 Fuzzifier
   3.2 Fuzzy Operator
   3.3 Defuzzification
4. FLC Analysis
   4.1 Fuzzy Systems Approximation Properties
5. Fuzzy Logic Controllers Design
   5.1 Experimental FLC Design
   5.2 FLC from Linear Ones
   5.3 FLC Supervision
6. Process Fuzzy Modelling
7. Cement Kiln Control
   7.1 Process Description
   7.2 Control Structure

Searching for the Best: Stochastic Approximation, Simulated Annealing and Related Procedures
Georg Pflug

List of Contributors