Towards a Theoretical Framework for LCS

Jan Drugowitsch, Dr Alwyn Barry


DESCRIPTION

Alwyn Barry introduces the theoretical framework for LCS that Jan Drugowitsch is currently working on.

TRANSCRIPT

Page 1: Towards a Theoretical Framework for LCS

Towards a Theoretical Framework for LCS

Jan Drugowitsch

Dr Alwyn Barry

Page 2: Towards a Theoretical Framework for LCS

May 2006 © A.M.Barry and J. Drugowitsch, 2006

Overview

Big Question: Why use LCS?

1. Aims and Approach

2. Function Approximation

3. Dynamic Programming

4. Classifier Replacement

5. Future Work

Page 3: Towards a Theoretical Framework for LCS

Research Aims

- To establish a formal model for Accuracy-based LCS

- To use established models, methods and notation

- To transfer method knowledge (analysis / improvement)

- To create links to other ML approaches and disciplines

Page 4: Towards a Theoretical Framework for LCS

Approach

- Reduce LCS to components:
  – Function Approximation
  – Dynamic Programming
  – Classifier Replacement

- Model the components

- Model their interactions

- Verify models by empirical studies

- Extend models from related domains

Page 5: Towards a Theoretical Framework for LCS

Function Approximation

- Architecture outline:
  – Set of K classifiers, matching states S_k ⊆ S
  – Every classifier k approximates the value function V over S_k independently

- Assumptions:
  – Function V : S → ℜ is time-invariant
  – Time-invariant set of classifiers {S_1, …, S_K}
  – Fixed sampling distribution Π(S) given by π : S → [0, 1]
  – Minimising the MSE:

    ∑_{i ∈ S_k} π(i) (V(i) − Ṽ_k(i))²

Page 6: Towards a Theoretical Framework for LCS

Function Approximation

- Linear approximation architecture:
  – Feature vector φ : S → ℜ^L
  – Approximation parameter vector ω_k ∈ ℜ^L
  – Approximation:

    Ṽ_k(i) = φ(i)′ ω_k

- Averaging classifiers: φ(i) = (1), ∀i ∈ S

- Linear estimators: φ(i) = (1, i)′, ∀i ∈ S
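As a concrete sketch of this architecture (function and variable names are illustrative, not from the slides), the two feature maps and the local prediction Ṽ_k(i) = φ(i)′ω_k can be written in Python as:

```python
import numpy as np

# Sketch of the linear approximation architecture; names are
# illustrative, not from the slides.

def phi_averaging(i):
    """Averaging classifiers: phi(i) = (1,) for every state i."""
    return np.array([1.0])

def phi_linear(i):
    """Linear estimators: phi(i) = (1, i)'."""
    return np.array([1.0, float(i)])

def predict(w_k, phi, i):
    """Classifier k's local estimate V~_k(i) = phi(i)' w_k."""
    return float(phi(i) @ w_k)

# An averaging classifier predicts its single weight everywhere;
# a linear classifier predicts an affine function of the state.
w_avg = np.array([2.5])          # V~_k(i) = 2.5 for all i
w_lin = np.array([1.0, 0.5])     # V~_k(i) = 1.0 + 0.5 * i
```

An averaging classifier is thus the special case L = 1 with a constant feature, which is why its least-squares solution later reduces to a plain mean over matched states.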

Page 7: Towards a Theoretical Framework for LCS

Function Approximation

- Approximation to the MSE for classifier k at time t:

    ε_{k,t} = c_{k,t}⁻¹ ∑_{m=0}^{t} I_{S_k}(i_m) (V(i_m) − ω′_{k,t} φ(i_m))²

  – Matching: I_{S_k} : S → {0, 1}
  – Match count: c_{k,t} = ∑_{m=0}^{t} I_{S_k}(i_m)

- Since the sequence of states i_m is dependent on π, the MSE is automatically weighted by this distribution.

Page 8: Towards a Theoretical Framework for LCS

Function Approximation

- Mixing classifiers:
  – A classifier k covers S_k ⊆ S
  – Need to mix estimates to recover Ṽ(i)
  – Requires a mixing parameter (will be 'accuracy' … see later!)

    Ṽ(i) = ∑_{k=1}^{K} ψ_k(i) ω′_k φ(i)

    where ψ_k : S → [0, 1] and ∑_{k=1}^{K} ψ_k(i) = 1

  – ψ_k(i) = 0 where i ∉ S_k … classifiers that do not match do not contribute
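A minimal sketch of this mixing step, assuming linear estimators and (purely as a placeholder) uniform mixing weights over the matching classifiers; the slides replace this uniform choice with accuracy-based weights later:

```python
import numpy as np

def mixed_prediction(i, classifiers):
    """Global estimate V~(i) = sum_k psi_k(i) * phi(i)' w_k.

    classifiers: list of (matches, w) pairs, where matches(i) -> bool
    plays the role of the indicator of S_k.  Non-matching classifiers
    get psi_k(i) = 0 and so do not contribute.
    """
    phi_i = np.array([1.0, float(i)])      # phi(i) = (1, i)'
    preds = [phi_i @ w for matches, w in classifiers if matches(i)]
    # placeholder: uniform psi_k(i) over matching classifiers,
    # normalised so that sum_k psi_k(i) = 1
    return float(sum(preds) / len(preds))
```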

Page 9: Towards a Theoretical Framework for LCS

Function Approximation

- Matrix notation:

    V = (V(1), …, V(N))′

    Φ = (φ(1)′, …, φ(N)′)′   (an N × L matrix with rows φ(i)′)

    D = diag(π(1), …, π(N))

    I_{S_k} = diag(I_{S_k}(1), …, I_{S_k}(N)),   D_k = I_{S_k} D

    Ψ_k = diag(ψ_k(1), …, ψ_k(N))

    Ṽ_k = Φ ω_k

    Ṽ = ∑_{k=1}^{K} Ψ_k Ṽ_k = ∑_{k=1}^{K} Ψ_k Φ ω_k

Page 10: Towards a Theoretical Framework for LCS

Function Approximation

- Achievements:
  – Proof that the MSE of a classifier k at time t is the normalised sample variance of the value function over the states matched until time t
  – Proof that the expression developed for the sample-based approximated MSE is a suitable approximation for the actual MSE, for finite state spaces
  – Proof of optimality of the approximated MSE from the vector-space viewpoint (relates classifier error to the Principle of Orthogonality)

Page 11: Towards a Theoretical Framework for LCS

Function Approximation

- Achievements (cont'd):
  – Elaboration of algorithms using the model:
    · Steepest Gradient Descent
    · Least Mean Square
      – Show that sensitivity to step size and input range holds in LCS
    · Normalised LMS
    · Kalman Filter
      – … and its inverse covariance form
      – Equivalence to Recursive Least Squares
      – Derivation of a computationally cheap implementation
  – For each, derivation of:
    · Stability criteria
      – For LMS, identification of the excess mean squared estimation error
    · Time constant bounds
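A sketch of the LMS and normalised-LMS weight updates the slide analyses, applied to a single classifier's weight vector (step sizes here are illustrative; the slides derive the actual stability criteria for them):

```python
import numpy as np

def lms_update(w, phi_i, target, step=0.1):
    """Least Mean Square: w <- w + step * phi(i) * (target - phi(i)' w)."""
    return w + step * phi_i * (target - phi_i @ w)

def nlms_update(w, phi_i, target, step=0.5, eps=1e-8):
    """Normalised LMS: dividing the step by ||phi(i)||^2 removes the
    sensitivity to the input range noted on the slide."""
    return w + (step / (phi_i @ phi_i + eps)) * phi_i * (target - phi_i @ w)
```

For an averaging classifier (φ(i) = (1,)) the LMS update reduces to an exponentially weighted running mean of the targets, which is where the step-size / time-constant trade-off comes from.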

Page 12: Towards a Theoretical Framework for LCS

Function Approximation

- Achievements (cont'd):
  – Investigation:

Page 13: Towards a Theoretical Framework for LCS

Function Approximation

- How to set the mixing weights ψ_k(i)?
  – Accuracy-based mixing (YCS-like!)
  – Divorce mixing for function approximation from fitness
  – Investigating power factor ν ∈ ℜ⁺
  – Demonstrate, using the model, that we obtain the MLE of the mixed classifiers when ν = 1

    ψ_k(i) = I_{S_k}(i) ε_k^{−ν} / ∑_{p=1}^{K} I_{S_p}(i) ε_p^{−ν}
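The accuracy-based mixing weights can be computed directly from the per-classifier errors; a small vectorised sketch (names my own):

```python
import numpy as np

def mixing_weights(errors, matches, nu=1.0):
    """psi_k(i) = I_{S_k}(i) * eps_k^(-nu) / sum_p I_{S_p}(i) * eps_p^(-nu).

    errors:  per-classifier approximation errors eps_k (> 0)
    matches: per-classifier indicators I_{S_k}(i) in {0, 1}
    """
    m = np.asarray(matches, dtype=float)
    raw = m * np.asarray(errors, dtype=float) ** (-nu)
    return raw / raw.sum()      # normalise over matching classifiers
```

Raising ν sharpens the weighting towards the lowest-error classifier, matching the empirical observation on the next slide that larger ν gives more importance to low error.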

Page 14: Towards a Theoretical Framework for LCS

Function Approximation

- Mixing - Empirical Results:
  – Higher ν gives more importance to low error
  – Found ν = 1 to be the best compromise

  [Figures: Classifiers: 0..π/2, π/2..π, 0..π | Classifiers: 0..0.5, 0.5..1, 0..1]

Page 15: Towards a Theoretical Framework for LCS

Dynamic Programming

- Dynamic Programming:

    V*(i) = max_μ E[ ∑_{t=0}^{∞} γ^t r(i_t, a_t, i_{t+1}) | i_0 = i, a_t = μ(i_t) ]

- Bellman's Equation:

    V*(i) = max_{a∈A} E[ r(i, a, j) + γ V*(j) | j ~ p(·|i, a) ]

- Applying the DP operator T to a value estimate:

    (TV)(i) = max_{a∈A} E[ r(i, a, j) + γ V(j) | j ~ p(·|i, a) ]
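For a finite MDP the operator T can be written down directly; here is a tabular sketch, applied to a toy two-state MDP invented purely for illustration:

```python
import numpy as np

def dp_operator(V, P, R, gamma=0.9):
    """(TV)(i) = max_a sum_j P[a,i,j] * (R[a,i,j] + gamma * V[j]).

    P: (A, N, N) transition probabilities; R: (A, N, N) rewards.
    """
    # expected one-step return Q[a, i], then the max over actions
    Q = np.einsum('aij,aij->ai', P, R + gamma * V[None, None, :])
    return Q.max(axis=0)

# toy 2-state MDP: both actions stay in place, action 1 pays twice as much
P = np.stack([np.eye(2), np.eye(2)])
R = np.stack([np.ones((2, 2)), 2.0 * np.ones((2, 2))])
```

Iterating V ← TV from any starting point converges to V* by the contraction property stated on the next slide; in this toy case the fixed point is V*(i) = 2/(1 − γ).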

Page 16: Towards a Theoretical Framework for LCS

Dynamic Programming

- T is a contraction w.r.t. the maximum norm
  – Converges to the unique fixed point V* = TV*

- With function approximation:
  – Approximation after every update:

    Ṽ_{t+1} = f(TṼ_t)

  – Might not converge for more complex approximation architectures

- Is LCS function approximation a non-expansion w.r.t. the maximum norm?

Page 17: Towards a Theoretical Framework for LCS

Dynamic Programming

- LCS Value Iteration
  – Assume:
    · constant population
    · averaging classifiers, φ(i) = (1), ∀i ∈ S
  – Classifier is approximating:

    V_{t+1} = TṼ_t

  – Based on minimising the MSE:

    ∑_{i∈S_k} (V_{t+1}(i) − Ṽ_{k,t+1}(i))² = ‖V_{t+1} − Ṽ_{k,t+1}‖²_{I_{S_k}} = ‖TṼ_t − Ṽ_{k,t+1}‖²_{I_{S_k}}

Page 18: Towards a Theoretical Framework for LCS

Dynamic Programming

- LCS Value Iteration (cont'd)
  – Minimum is the orthogonal projection onto the approximation:

    Ṽ_{k,t+1} = Π_{I_{S_k}} V_{t+1} = Π_{I_{S_k}} TṼ_t

  – Need also to determine the new mixing weights by evaluating the new approximation error:

    ε_{k,t+1} = Tr(I_{S_k})⁻¹ ‖V_{t+1} − Ṽ_{k,t+1}‖²_{I_{S_k}} = Tr(I_{S_k})⁻¹ ‖(I − Π_{I_{S_k}}) TṼ_t‖²_{I_{S_k}}

  – ∴ the only time dependency is on Ṽ_t
  – Thus:

    Ṽ_{t+1} = ∑_{k=1}^{K} Ψ_{k,t+1} Π_{I_{S_k}} TṼ_t
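For averaging classifiers the projection Π_{I_{S_k}} has a particularly simple form: the best constant approximation of a value vector over the matched states is their mean. A sketch of that single projection step (the zero outside S_k is a convention for this sketch):

```python
import numpy as np

def project_averaging(V, matched):
    """Pi_{I_{S_k}} V for an averaging classifier: replace V on the
    matched states by their mean (the least-squares constant fit);
    states the classifier does not match are left at zero here."""
    m = np.asarray(matched, dtype=bool)
    out = np.zeros_like(V, dtype=float)
    out[m] = V[m].mean()
    return out
```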

Page 19: Towards a Theoretical Framework for LCS

Dynamic Programming

- LCS Asynchronous Value Iteration
  – Only update Ṽ_k for the matched state
  – Minimise:

    ∑_{m=0}^{t} I_{S_k}(i_m) ((TṼ_t)(i_m) − ω′_k φ(i_m))²

  – Thus, approximation costs are weighted by the state distribution:

    ∑_{i∈S_k} π(i) ((TṼ_t)(i) − ω′_k φ(i))² = ‖TṼ_t − Φω_k‖²_{D_k}

    ∴ Ṽ_{k,t+1} = Π_{D_k} TṼ_t

    Ṽ_{t+1} = ∑_{k=1}^{K} Ψ_{k,t+1} Π_{D_k} TṼ_t

Page 20: Towards a Theoretical Framework for LCS

Dynamic Programming

- Also derived:
  – LCS Q-Learning
    · LMS Implementation
    · Kalman Filter Implementation
    · Demonstrate that the weight update uses the normalised form of local gradient descent
  – Model-based Policy Iteration:

    Ṽ_{t+1} = ∑_{k=1}^{K} Ψ_k Π_{D_k} T_μ Ṽ_t

  – Step-wise Policy Iteration
  – TD(λ) is not possible in LCS due to re-evaluation of mixing weights

Page 21: Towards a Theoretical Framework for LCS

Dynamic Programming

- Is LCS function approximation a non-expansion w.r.t. the maximum norm?
  – We derive that:
    · Constant Mixing: Yes
    · Arbitrary Mixing: Not necessarily
    · Accuracy-based Mixing: non-expansion (dependent upon a conjecture)

- Is LCS function approximation a non-expansion w.r.t. a weighted norm?
  – We know T_μ is a contraction
  – We show Π_{D_k} is a non-expansion
  – We demonstrate:
    · Single Classifier: convergence to a fixed point
    · Disjoint Classifiers: convergence to a fixed point
    · Arbitrary Mixing: spectral radius can be ≥ 1 even for some fixed weightings

Page 22: Towards a Theoretical Framework for LCS

Classifier Replacement

- How can we define the optimal population formally?
  – What do we want to reach?
  – How does the global fitness of one classifier differ from its local fitness (i.e. accuracy)?
  – What is the quality of a population?

- Methodology:
  – Examine methods to define the optimal population
  – Relate to generalised hill-climbing
  – Map to other search criteria

Page 23: Towards a Theoretical Framework for LCS

Future Work

- Formalising and analysing classifier replacement

- Interaction of replacement with function approximation

- Interaction of replacement with DP

- Handling stochastic environments

- Error bounds and rates of convergence

- …

Big Answer: LCS is an integrative framework…

… But we need the framework definition!