Identification, Adaptation, Learning: The Science of Learning Models from Data. Edited by Sergio Bittanti and Giorgio Picci. NATO ASI Series, Series F: Computer and Systems Sciences, Vol. 153


  • Identification, Adaptation, Learning

    The Science of Learning Models from Data

    Edited by Sergio Bittanti and Giorgio Picci

    NATO ASI Series

    Series F: Computer and Systems Sciences, Vol. 153

  • Identification, Adaptation, Learning

    The Science of Learning Models from Data

    Edited by

    Sergio Bittanti, Politecnico di Milano, Piazza Leonardo da Vinci 32, I-20133 Milano, Italy

    Giorgio Picci, Università di Padova, Via Gradenigo 6/A, I-35131 Padova, Italy

    Springer

    Published in cooperation with NATO Scientific Affairs Division

  • Dedicated to the memory of E.J. Hannan (1921-1994)

    Photograph by Peter Hall, taken at the farewell to Geof Watson

    Princeton University, 1992

  • Preface

    This book collects the lectures given at the NATO Advanced Study Institute From Identification to Learning, held in Villa Olmo, Como, Italy, from August 22 to September 2, 1994.

    The school was devoted to the themes of Identification, Adaptation and Learning, as they are currently understood in the Information and Control engineering community, their development in the last few decades, their interconnections and their applications. These titles describe challenging, exciting and rapidly growing research areas which are of interest both to control and communication engineers and to statisticians and computer scientists.

    In accordance with the general goals of the Institute, and notwithstanding the rather advanced level of the topics discussed, the presentations have been generally kept at a fairly tutorial level. For this reason this book should be valuable to a variety of researchers and to graduate students interested in the general area of Control, Signals and Information Processing. As the goal of the school was to explore a common methodological line of reading the issues, the flavor is quite interdisciplinary. We regard this as an original and valuable feature of this book.

    During the two weeks of the school at Villa Olmo we experienced a unique atmosphere and a most remarkable climate of interaction and communication between the outstanding experts gathered in Como for this occasion. It is remarkable that some of them hardly ever meet at conferences or at scientific meetings, as their different fields have traditionally evolved along separate lines. The openness and active participation in discussions by both students and speakers was a major factor in the success of this Advanced Study Institute. The editors of this volume would like to thank all lecturers, and the remaining members of the Organizing Committee, S.K. Mitter and Jan C. Willems, for their helpful advice.

    The superb local organization provided by the Centro di Cultura Scientifica Alessandro Volta deserves a primary acknowledgement, for it was a major factor for the smooth development of the school. Special thanks go to Manuela Toglio for her care in the general organizational aspects, and to Emanuela Salati for her kindness and patience in dealing with daily problems of students and teachers.


    Last but not least, we would like to thank NATO for believing in this project and for the generous support of the Institute, and the Consiglio Nazionale delle Ricerche (CNR) of Italy, which also provided financial funding. The general support of the Dipartimento di Elettronica e Informazione of the Politecnico di Milano is also gratefully acknowledged.

    The final software layout of the book is due to Stefano Bertoncello, with the assistance of Marco Lovera.

    March 1996 Sergio Bittanti and Giorgio Picci

  • Table of Contents

    Geometric Methods for State Space Identification
    Anders Lindquist and Giorgio Picci

      1. Introduction
        1.1 Stationary Signals and the Statistical Theory of Model Building
        1.2 Input-Output Models
      2. State Space Models of Stationary Processes
      3. Spectral Factorization
      4. Spectral Factorization and the LMI
        4.1 Ordering, (A,C) Pairs and Uniform Choice of Basis in I
      5. Finite-Interval Realizations of a Stationary Process
        5.1 Forward and Backward Kalman Filtering and the Family of Minimal Stationary Realizations of U
        5.2 Finite-Interval Realizations
      6. Estimation, Partial Realization and Balancing
        6.1 Positivity
        6.2 The Hilbert Space of a Stationary Signal
        6.3 Identification Based on Finite Data
        6.4 The Partial Realization Problem
        6.5 Partial Realization via SVD
        6.6 Stochastic Balanced Realizations: the Stationary Setting
        6.7 Stochastic Balanced Realizations: the Case of Finite Data
      7. The "Subspace Methods" Identification Algorithm of Van Overschee and De Moor
        7.1 Choosing Bases in the Predictor Spaces
        7.2 Skipping some Redundant Steps
        7.3 The Least Squares Implementation
        7.4 Use of the SVD

    Parameter Estimation of Multivariable Systems Using Balanced Realizations
    J.M. Maciejowski

      2. Problem Setting
      3. Identifiable Parametrizations
      4. Balanced Parametrization
      5. Some Useful Classes of Models
        5.1 Minimum-Phase Models
        5.2 Positive-Real Models
      6. Outline of Parameter Estimation
      7. Gradient Calculations
      8. Finding an Initial Model
        8.1 Available Methods
        8.2 Realization Methods
        8.3 Subspace Methods
        8.4 Guaranteeing Stability
        8.5 Estimating n
      9. Examples
        9.1 Distillation Column
        9.2 Industrial Dryer
        9.3 Sea Wave Spectrum
      10. Conclusions

    Balanced Canonical Forms
    Raimund J. Ober

      1. Introduction
      2. Lyapunov Balanced Realizations and Model Reduction
      3. A Lyapunov Balanced Canonical Form for Stable Continuous-Time Systems
      4. L-Characteristic, LQG-Balanced Canonical Form and Model Reduction for Minimal Systems
      5. Characteristics, Canonical Forms and Model Reduction for Bounded-Real and Positive-Real Systems
      6. Concluding Remarks

    From Data to State Model
    Paolo Rapisarda and Jan C. Willems

      1. Introduction
      2. Background
        2.1 Discrete Time Systems
        2.2 Latent Variables
        2.3 State Models
        2.4 Existence and Uniqueness of State Space Models
        2.5 Input/State/Output, Output Nulling, and Driving Variable Representations
        2.6 Recapitulation
      3. From Difference Equation to State Models
        3.1 Basic Notions
        3.2 From Kernel Representations to X(ξ)
        3.3 From Hybrid Representation to X(ξ)
        3.4 From Image Representation to X(ξ)
        3.5 From X(ξ) to State Equation
        3.6 Y(ξ)
        3.7 Simulation
        3.8 Recapitulation
      4. The Most Powerful Unfalsified Model
        4.1 Basics
        4.2 Existence of the Most Powerful Unfalsified Model
        4.3 From Time Series to State Space Model
        4.4 Realization Theory as a Special Case
        4.5 Recapitulation
      5. Algorithms
        5.1 From Behavior to State Space Model
        5.2 From Time Series to State Space Model I
        5.3 Common Features and Relative Row Rank
        5.4 From Time Series to State Model II
        5.5 Subspace Identification
        5.6 Realization Theory as a Special Case
      6. Approximate Modeling
      7. Simulations
      A. Notation

    Identification of Linear Systems from Noisy Data
    Manfred Deistler and Wolfgang Scherrer

      1. Introduction
      2. The Model
      3. The Frisch Case, Bivariate Observations
      4. The Frisch Case, General n
      5. The Bounded Noise Case

    Identification in H∞: Theory and Applications
    Pramod P. Khargonekar, Guoxiang Gu, and Jonathan Friedman

      1. Introduction
      2. Problem Formulation
        2.1 Discrete-Time Systems
        2.2 Experimental Data
        2.3 Identification in H∞
      3. Background Results
      4. Linear Algorithms
        4.1 The Kernel Function
        4.2 Error Analysis
      5. Nonlinear Algorithms
        5.1 Two-Stage Nonlinear Algorithm
        5.2 Convex and Concave Windows
        5.3 Frequency Domain Analysis
        5.4 Trapezoidal Window
      6. Engineering Applications

    System Identification with Information Theoretic Criteria
    A.A. Stoorvogel and J.H. van Schuppen

      1. Introduction
      2. Problem Formulation
      3. Approximation with Mutual Information
        3.1 Mutual Information
        3.2 A Parameter Estimation Problem
        3.3 Relation of Mutual Information, H∞ Entropy, and LEQG Cost
        3.4 Parameter Estimation with an Exponential-of-Quadratic Cost
        3.5 Parameter Estimation with H∞ Entropy
      4. Approximation with Likelihood and Divergence
        4.1 Approximation with the Likelihood Function
        4.2 Divergence
        4.3 Relation of Likelihood Function and Divergence
        4.4 Approximation with Divergence
        4.5 Parameter Estimation by Divergence Minimization
      5. Concluding Remarks
      A. Concepts from Probability and the Theory of Stochastic Processes
        A.1 Probability Concepts
        A.2 Gaussian Random Variables
        A.3 Concepts from the Theory of Stochastic Processes
      B. Concepts from System Theory
      C. Concepts from Information Theory
      D. Information Measures of Gaussian Random Variables
      E. Information Measures of Stationary Gaussian Processes
      F. LEQG Optimal Stochastic Control
      G. H-Infinity Control with an Entropy Criterion

    Least Squares Based Self-Tuning Control Systems
    Sergio Bittanti and Marco Campi

      1. Introduction
      2. Self-Tuning Adaptive Control
        2.1 Basic Self-Tuning Concepts
        2.2 Mathematical Framework
      3. What Makes a ST Control System Nice?
        3.1 Imaginary and Asymptotic Imaginary Systems
        3.2 Adaptive Stabilization
        3.3 Self-Optimality
        3.4 List of Symbols
      4. The Least Squares Identification Algorithm
      5. Deterministic Analysis of Self-Tuning Control Systems
        5.1 Excitation Subspace and Convergence of RLS
        5.2 Adaptive Stabilization and Self-Optimality
        5.3 Specific Control Laws
      6. Stochastic Analysis of Self-Tuning Control Systems
        6.1 Stochastic Excitation Subspace and Convergence of RLS
        6.2 Stochastic Adaptive Stabilization and Self-Optimality
      7. Concluding Remarks: Towards a Theory of Tuning

    On Neural Network Model Structures in System Identification
    L. Ljung, J. Sjöberg, and H. Hjalmarsson

      1. Introduction and Summary
        1.1 What is the Problem?
        1.2 Black Boxes
        1.3 Nonlinear Black Box Models
        1.4 Estimating ĝ_N
        1.5 Properties of the Estimated Model
        1.6 Basis Functions
        1.7 What Is the Neural Network Identification Approach?
        1.8 Why Have Neural Networks Attracted So Much Interest?
        1.9 Related Approaches
      2. The Problem
        2.1 Inferring Relationships from Data
        2.2 Prior Assumptions
        2.3 Function Classes
      3. Some General Estimation Results
      4. The Bias/Variance Trade-Off
      5. Neural Nets
        5.1 Feedforward Neural Nets
        5.2 Recurrent Neural Nets
      6. Algorithmic Aspects
        6.1 Search Directions
        6.2 Back-Propagation: Calculation of the Gradient
        6.3 Implicit Regularization
        6.4 Off-line and On-line Algorithms
        6.5 Local Minima
      7. Adaptive Methods
        7.1 Adaptive Basis Function Expansion
        7.2 The "Curse" of Dimensionality
        7.3 Methods to Avoid the "Curse"
      8. Specific Properties of NN Structures
      9. Models of Dynamical Systems Based on Neural Networks
        9.1 A Review of Linear Black Box Models
        9.2 Choice of Regressors for Neural Network Models
        9.3 Neural Network Dynamic Models
        9.4 Some Other Structural Questions
        9.5 The Identification Procedure

    An Overview of Computational Learning Theory and Its Applications to Neural Network Training
    M. Vidyasagar

      1. Introduction
      2. Problem Formulation
      3. Summary of Known Results
      4. Families of Measures with a Nonempty Interior
      5. Totally Bounded Families of Measures
      6. Two Sufficient Conditions
      7. Conclusions
      Appendix: A Counterexample

    Just-in-Time Learning and Estimation
    George Cybenko

      1. Introduction
      2. Global Models
      3. Local Models
      4. Just-In-Time Models
        4.1 Analysis of Just-In-Time Models
        4.2 Discussion

    Wavelets in Identification
    A. Benveniste, A. Juditsky, B. Delyon, Q. Zhang, and P-Y. Glorennec

      1. Introduction, Motivations, Basic Problems
        1.1 Two Application Examples
      2. Basic Mathematical Problems
      3. Classical Methods of Nonlinear System Identification: Linear Nonparametric Estimators
        3.1 Projection Estimates as an Example of Linear Nonparametric Estimators
        3.2 Choice of Model Order, Bandwidth, or Binwidth: the Generalized Cross Validation GCV Method
      4. Performance Analysis of the Nonparametric Estimators
        4.1 Lower Bounds for Best Achievable Performance
        4.2 Discussion
      5. Nonlinear Estimates
      6. Wavelets: What They Are, and Their Use in Approximating Functions
        6.1 The Continuous Wavelet Transform
        6.2 The Discrete Wavelet Transform: Orthonormal Bases of Wavelets and Extensions
        6.3 Wavelets and Besov Spaces
      7. Wavelets: Their Use in Nonparametric Estimation
      8. A Wavelet Network for Practical System Identification
        8.1 The Wavelet Network and Its Structure
        8.2 Constructing the Wavelet Library W
        8.3 Selecting Best Wavelet Regressors
      9. Fuzzy Models: Expressing Prior Information in Nonlinear Nonparametric Models
      10. Experimental Results
        10.1 Modelling the Gas Turbine System
        10.2 Modelling the Hydraulic Actuator of the Robot Arm
      11. Discussion and Conclusions

    Fuzzy Logic Modelling and Control

      1. Motivation
      2. Fuzzy Logic Basic Concepts
        2.1 Fuzzy Logic Variables
        2.2 Fuzzy Logic Operations
        2.3 Approximated Reasoning
        2.4 Defuzzification
        2.5 Fuzzy Systems
      3. Fuzzy Logic Controller Structure
        3.1 Fuzzifier
        3.2 Fuzzy Operator
        3.3 Defuzzification
      4. FLC Analysis
        4.1 Fuzzy Systems Approximation Properties
      5. Fuzzy Logic Controllers Design
        5.1 Experimental FLC Design
        5.2 FLC from Linear Ones
        5.3 FLC Supervision
      6. Process Fuzzy Modelling
      7. Cement Kiln Control
        7.1 Process Description
        7.2 Control Structure

    Searching for the Best: Stochastic Approximation, Simulated Annealing and Related Procedures
    Georg Pflug

    List of Contributors