Oneiric Machine Learning

The Foundations of Dream Inspired Adaptive Systems

Julian Holley

Faculty of Computing, Engineering and Mathematical Sciences

University of the West of England

A thesis submitted in partial fulfilment of the requirements of the University of the West of England, Bristol for the degree of Doctor of Philosophy

November 2008

For Sarah & Elliot

Acknowledgements

Foremost, I would like to thank my supervisory team, Director of Studies Dr. Tony Pipe and Dr. Brian Carse. I would especially like to emphasise their enthusiasm, understanding, support and friendship over a protracted and turbulent period in which I have worked on this study.

Secondly, I would like to thank the general manager Mr. Alan Bromley & former manager and director Abdul Basharat of Clares Merchandise Handling Equipment Ltd., my former employer and sponsor. I would especially like to acknowledge how I have been able to restore the balance between work commitments and my personal aspirations as a researcher in recent months. I would also like to express my thanks to Professor Larry Bull, responsible for my recent employment, for his support during the final stages.

Thirdly, for the degree to which the semantic clarity and comprehension of the thesis have been improved, I owe many thanks to Anthony Chandor. Any remaining errors are in work that did not pass under his critical eye, and for these I alone am responsible.

Finally, and with due humility for the space in her life that I have selfishly stolen for the sake of my own personal quest for understanding, I thank Sarah.

Copyright Notice

This copy has been supplied on the understanding that it is copyright material and that no quotation from the thesis may be published without proper acknowledgement.

Abstract

Artificial adaptive systems inspired by or derived from neuro-biological components and processes have shown great promise at several levels. One behaviour required for the continuous functional operation of advanced neuro-biological systems is sleep. A definitive function or purpose for sleep, and for the associated phenomenology such as dreaming, remains elusive. Correspondingly, there remain many unresolved issues within the domain of artificial learning systems. One such aspect that remains largely intractable is the management of experiences once they are learned and encoded. This is the general problem of developing a persuasive explanation or scalable strategy for the contiguous organisation of internal representation and memory within finite resources; it is from this parallel perspective that this research is set.

This research is an exploration into the cognition of sleep and dreaming in humans and animals. Positioned between sleep & dreaming research and the machine learning domain, this thesis reports on an approach to improve the latter by formulating theories emerging from the former.

Recent research investigating the responsibility of sleep processes in modifying memory has shown that, for the avian and mammalian brain, sleep plays an important role in long term cognitive development. A set of observations is created from the current understanding of both the benefits of sleep and the processes involved, including dreaming. From these observations the first contribution of this thesis is presented: several proposals for the cognitive benefits of sleep and dreaming in aspects of perception, consolidation, scalability, generalisation and representational conceptualisation.

Previous research has investigated some aspects of sleep and dreaming in relation to machine learning. These efforts have been positioned at two extremes of the machine learning paradigm: the low level, emergent behaviour of artificial neural networks, or the high level, directed behaviour of symbolic artificial intelligence. This is the first report of direct research into translating the benefits of sleep and dreaming, by analogous mechanisms, at a level in between these earlier extremes. This combination is characterised by creating a foundation for a new genre of artificial learning strategies derived directly from sleep and dream phenomenology: Oneiric Machine Learning.1

Anticipatory classifier systems (ACS) represent a niche group of machine learning systems derived from the established machine learning field of learning classifier systems (LCS). ACS are capable of latent learning: learning for the reward of learning alone, and subsequently creating an internal, generalised model of the environment. This feature, aligned within the LCS framework, provides an ideal developmental template. A review of the latent learning background and of ACS algorithmic detail sets the basis for several applications illustrative of the Oneiric Machine Learning approach. Empirical evidence demonstrates how an adapted ACS system can exploit a dreamlike emergent thread, based on an incomplete, generalised model of the environment, to reduce the number of real actions required to reach model competency. Conceptual solutions are presented to the restrictions limiting the degree to which ACS/LCS systems can represent some aspects advocated by Oneiric Machine Learning.

In mitigation of these restrictions, two novel prototype systems are described. The first introduces a method of implicitly managing state generalisation by building concept links into the classifier rule. The second illustrates automatic, state-alias-triggered state augmentation and off-line resolution. Although these remain under development, results in these new directions present plausible systems-level architectures that are in part experimentally demonstrated. Novel solutions are presented to structural and procedural problems, promoting the future development of cognitive systems within the LCS framework and setting a direction for future studies.

1Oneiric: of or relating to dreams or dreaming. Adapted from Oneiric Behaviour (Jouvet, 1979), used to describe rapid eye movement (REM) sleep re-animation.

Contents

1 Introduction 1

1.1 Paradox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Nature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.3 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.4 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

1.5 Thesis Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

1.5.1 Part I: Sleep, Dreaming & Learning . . . . . . . . . . . . . 11

1.5.2 Part II: Latent & Machine Learning . . . . . . . . . . . . . 12

1.5.3 Part III: Heterogenesis . . . . . . . . . . . . . . . . . . . . 12

1.5.3.1 Hypothesis . . . . . . . . . . . . . . . . . . . . . 12

1.5.3.2 Consolidation . . . . . . . . . . . . . . . . . . . . 13

1.5.3.3 Generalisation . . . . . . . . . . . . . . . . . . . 13

1.5.3.4 State Augmentation . . . . . . . . . . . . . . . . 14

1.6 Audience . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

I Sleep, Dreaming and Learning 16

2 Foundations of Sleep and Dreaming 17

2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.2 Stages of Sleep . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

2.2.1 Slow Wave Sleep Stage . . . . . . . . . . . . . . . . . . . . 23

2.2.2 REM Sleep Stage . . . . . . . . . . . . . . . . . . . . . . . 23

2.3 Physiology of Sleep . . . . . . . . . . . . . . . . . . . . . . . . . . 24

2.3.1 Endocrine System . . . . . . . . . . . . . . . . . . . . . . . 24

2.3.2 Thermoregulation . . . . . . . . . . . . . . . . . . . . . . . 25

2.3.3 Respiratory System . . . . . . . . . . . . . . . . . . . . . . 25

2.3.4 Autonomic System . . . . . . . . . . . . . . . . . . . . . . 25

2.4 Ontogenesis of Sleep . . . . . . . . . . . . . . . . . . . . . . . . . 26

2.5 Neurophysiology of Sleep and Dreaming . . . . . . . . . . . . . . 27

2.5.1 State Transition . . . . . . . . . . . . . . . . . . . . . . . . 28

2.5.1.1 Waking to Slow Wave Sleep . . . . . . . . . . . . 28

2.5.1.2 Slow Wave Sleep to REM Sleep . . . . . . . . . . 29

2.5.1.3 Sleep to Waking . . . . . . . . . . . . . . . . . . 30

2.5.2 Waking . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

2.5.3 Slow Wave Sleep . . . . . . . . . . . . . . . . . . . . . . . 32

2.5.4 REM Sleep . . . . . . . . . . . . . . . . . . . . . . . . . . 34

2.6 Neurological Summary of Sleep . . . . . . . . . . . . . . . . . . . 38

2.6.1 Hippocampus . . . . . . . . . . . . . . . . . . . . . . . . . 47

2.6.2 Neurology of Memory . . . . . . . . . . . . . . . . . . . . . 50

2.7 Mentation During Sleep . . . . . . . . . . . . . . . . . . . . . . . 54

2.7.1 Hypnagogic Sleep . . . . . . . . . . . . . . . . . . . . . . . 54

2.7.2 Lucid Dreaming . . . . . . . . . . . . . . . . . . . . . . . . 55

2.7.3 Daydreaming . . . . . . . . . . . . . . . . . . . . . . . . . 55

2.8 Sleep and Dreaming in Animals . . . . . . . . . . . . . . . . . . . 56

2.8.1 Dreaming Cats . . . . . . . . . . . . . . . . . . . . . . . . 56

2.8.2 Dreaming Rats . . . . . . . . . . . . . . . . . . . . . . . . 60

2.8.3 Dreaming Birds . . . . . . . . . . . . . . . . . . . . . . . . 65

2.9 Sleep and Dreaming in Humans . . . . . . . . . . . . . . . . . . . 70

2.9.1 Human Dreams . . . . . . . . . . . . . . . . . . . . . . . . 70

2.9.2 Correlates of Dreaming . . . . . . . . . . . . . . . . . . . . 71

2.10 Memory and Sleep . . . . . . . . . . . . . . . . . . . . . . . . . . 73

2.11 Isomorphism, Creativity and Insight . . . . . . . . . . . . . . . . . 85

2.11.1 Antithesis Sleep and Memory Adaptation . . . . . . . . . . 92

2.12 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

II Latent Learning 98

3 Latent Learning: A Review 99

3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

3.2 Latent Learning: an Historical Road Map . . . . . . . . . . . . . 99

3.3 Selected Reviews . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

3.3.1 The Effect of Introduction of Reward Upon the Maze . . . 101

3.3.2 Purposive Behaviour in Animals and Men . . . . . . . . . 101

3.3.3 A Theoretical Derivation of Latent Learning . . . . . . . . 101

3.3.4 An Experimental Analysis of Latent Learning . . . . . . . 101

3.3.4.1 Review . . . . . . . . . . . . . . . . . . . . . . . 102

3.3.4.2 The Experiment . . . . . . . . . . . . . . . . . . 102

3.3.4.3 Discussion . . . . . . . . . . . . . . . . . . . . . . 103

3.3.5 Unrewarded Exploration and Maze Learning . . . . . . . . 104

3.3.6 Latent Learning Impaired by REM Sleep Deprivation . . . 104

3.3.7 Lookahead Planning and Latent Learning in LCS . . . . . 105

3.3.8 Anticipatory Classifier Systems . . . . . . . . . . . . . . . 106

3.3.9 Latent Learning in Khepera Robots with the ACS . . . . . 107

3.4 Latent Learning: Learning Without Reward . . . . . . . . . . . . 108

3.4.1 Animal Psychology . . . . . . . . . . . . . . . . . . . . . . 108

4 Related Systems 110

4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

4.2 The General Anticipatory Classifier System Framework . . . . . . 111

4.3 CFSC2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

4.3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

4.3.2 Classifier System Architecture . . . . . . . . . . . . . . . . 115

4.3.3 Representing an Internal Model . . . . . . . . . . . . . . . 119

4.3.4 Learning a Model . . . . . . . . . . . . . . . . . . . . . . . 121

4.3.5 Rule Adaptation . . . . . . . . . . . . . . . . . . . . . . . 123

4.3.6 Summary and Review CFSC2 . . . . . . . . . . . . . . . . 123

4.4 Original ACS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

4.4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

4.4.2 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . 126

4.4.3 Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

4.4.4 Latent Learning . . . . . . . . . . . . . . . . . . . . . . . . 128

4.4.4.1 Prediction Quality Adjustment . . . . . . . . . . 131

4.4.4.2 The ‘Specification of Changing Components’ . . . 132

4.4.4.3 The ‘Specification of Un-changing Components’ . 133

4.4.5 Behavioural Sequencing ‘Chunking’ . . . . . . . . . . . . . 133

4.4.6 ACS Summary and Review . . . . . . . . . . . . . . . . . 135

4.5 YACS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

4.5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

4.5.2 Latent Learning . . . . . . . . . . . . . . . . . . . . . . . . 138

4.5.2.1 Effect Covering . . . . . . . . . . . . . . . . . . . 138

4.5.2.2 Selection of Accurate Classifiers . . . . . . . . . . 139

4.5.2.3 Specification of Conditions . . . . . . . . . . . . . 139

4.5.2.4 Specialisation Process . . . . . . . . . . . . . . . 141

4.5.2.5 Condition Covering and Useless Classifiers . . . . 141

4.5.3 Policy Learning . . . . . . . . . . . . . . . . . . . . . . . . 141

4.5.3.1 YACS Notation . . . . . . . . . . . . . . . . . . . 142

4.6 Value Iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143

4.7 Modular Classifier System (MACS) . . . . . . . . . . . . . . . . . 144

4.7.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 144

4.7.2 Latent Learning and Generalisation . . . . . . . . . . . . . 146

4.7.3 Exploration and Exploitation Policies . . . . . . . . . . . . 150

4.7.3.1 Active Exploration . . . . . . . . . . . . . . . . . 151

4.7.3.2 Exploitation . . . . . . . . . . . . . . . . . . . . . 154

4.7.3.3 Combining Exploration and Exploitation . . . . . 155

4.8 Model-based Reinforcement Learning . . . . . . . . . . . . . . . . 156

4.9 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158

III Heterogenesis 160

5 Hypothesis 161

5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161

5.2 Aspects of Sleep and Dreaming . . . . . . . . . . . . . . . . . . . 162

5.2.1 General Aspects of Sleep . . . . . . . . . . . . . . . . . . . 162

5.2.2 Specific Aspects of Sleep . . . . . . . . . . . . . . . . . . . 164

5.2.3 Learning and Neural Adaptivity . . . . . . . . . . . . . . . 166

5.2.4 Dreaming and Delusion . . . . . . . . . . . . . . . . . . . . 166

5.2.5 Monotonicity of Cognition . . . . . . . . . . . . . . . . . . 168

5.2.6 Dreaming and Internal Representation . . . . . . . . . . . 169

5.2.7 A Closed System . . . . . . . . . . . . . . . . . . . . . . . 171

5.2.8 The Pressure of Sleep . . . . . . . . . . . . . . . . . . . . . 172

5.2.9 Emotions and Dreaming . . . . . . . . . . . . . . . . . . . 173

5.2.10 Periodicity of Waking and Sleep . . . . . . . . . . . . . . . 174

5.2.11 Hyperassociativity of Dreaming . . . . . . . . . . . . . . . 174

5.2.12 The Thread of Dreaming . . . . . . . . . . . . . . . . . . . 176

5.2.13 Temporal Relevancy of Dream Content . . . . . . . . . . . 177

5.2.14 The Progression of Sleep Mentation . . . . . . . . . . . . . 178

5.2.15 Development of Dreaming . . . . . . . . . . . . . . . . . . 179

5.2.16 Periodicity of SWS and REM sleep . . . . . . . . . . . . . 179

5.2.17 The Hippocampus Neocortex Dialogue During Sleep . . . . 181

5.2.18 Summary of Observations . . . . . . . . . . . . . . . . . . 186

5.3 Hypothesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187

5.3.1 Suggestion . . . . . . . . . . . . . . . . . . . . . . . . . . . 188

5.3.2 Reiteration . . . . . . . . . . . . . . . . . . . . . . . . . . 192

5.3.3 Capacity Constraint Theory . . . . . . . . . . . . . . . . . 194

5.3.4 Long and Short Memory Assimilation . . . . . . . . . . . . 196

5.3.4.1 Long Term Behaviour in Short Term Context . . 200

5.3.4.2 Short Term Behaviour in Long Term Context . . 201

5.3.5 Concept Building . . . . . . . . . . . . . . . . . . . . . . . 205

5.3.6 Generalisation: The Building of Concepts . . . . . . . . . 209

5.3.7 On-line or Off-line Adaptation? . . . . . . . . . . . . . . . 211

5.3.8 Structural and Temporal Credit Assignment . . . . . . . . 217

5.4 An Integrative View . . . . . . . . . . . . . . . . . . . . . . . . . 218

5.4.1 Formal Definition of Oneiric Machine Learning . . . . . . . 226

5.5 The Machine Learning Link . . . . . . . . . . . . . . . . . . . . . 229

6 Consolidation 231

6.1 Simulating Dreaming . . . . . . . . . . . . . . . . . . . . . . . . . 232

6.2 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239

6.3 Bounded Walk . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242

6.3.1 Results: Bounded Walk, Latent Learning Phase . . . . . . 247

6.3.2 Result Analysis: Bounded Walk . . . . . . . . . . . . . . . 257

6.3.3 Discussion: Bounded Walk . . . . . . . . . . . . . . . . . . 259

6.4 T-Maze . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262

6.4.1 Result Analysis: T-maze . . . . . . . . . . . . . . . . . . . 266

6.5 Critique . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267

6.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267

7 Generalisation 269

7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269

7.2 Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270

7.2.1 Unit of Representation: Classifier Configuration . . . . . . 272

7.2.2 Unit of Representation: Classifier Structure . . . . . . . . 273

7.2.3 Operation: Action Selection . . . . . . . . . . . . . . . . . 273

7.3 Generalised Example . . . . . . . . . . . . . . . . . . . . . . . . . 276

7.3.1 Conceptual Example . . . . . . . . . . . . . . . . . . . . . 276

7.3.2 Detailed Example . . . . . . . . . . . . . . . . . . . . . . . 279

7.3.3 Sequence (Run 1) . . . . . . . . . . . . . . . . . . . . . . . 282

7.3.4 Commentary (Run 1) . . . . . . . . . . . . . . . . . . . . . 283

7.3.5 Sequence (Run 2) . . . . . . . . . . . . . . . . . . . . . . . 290

7.3.6 Commentary (Run 2) . . . . . . . . . . . . . . . . . . . . . 291

7.3.7 Example Notes . . . . . . . . . . . . . . . . . . . . . . . . 294

7.4 Implementation: System I . . . . . . . . . . . . . . . . . . . . . . 295

7.5 Implementation: System II . . . . . . . . . . . . . . . . . . . . . . 303

7.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303

8 State Augmentation 305

8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305

8.1.1 Problem Definition . . . . . . . . . . . . . . . . . . . . . . 306

8.2 Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309

8.2.1 Proposition Method . . . . . . . . . . . . . . . . . . . . . 319

8.3 Further State Adaptation . . . . . . . . . . . . . . . . . . . . . . 326

8.3.1 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . 331

9 Conclusions 333

9.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333

9.2 Related Systems and Context . . . . . . . . . . . . . . . . . . . . 335

9.2.1 Sleep and Dream Related Systems . . . . . . . . . . . . . . 336

9.2.2 Non Sleep Related Systems . . . . . . . . . . . . . . . . . 338

9.3 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341

9.3.1 Foundation Contributions . . . . . . . . . . . . . . . . . . 341

9.3.2 Application Contributions . . . . . . . . . . . . . . . . . . 341

9.4 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345

9.5 Closing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346

A 348

A.1 Bounded Walk . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348

A.2 T-maze . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351

A.3 Nomenclature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358

References 388

List of Figures

1.1 Parallel research approach . . . . . . . . . . . . . . . . . . . . . . 10

2.1 Human stages of sleep . . . . . . . . . . . . . . . . . . . . . . . . 22

2.2 Principal lobes of the human cerebral cortex . . . . . . . . . . . . 40

2.3 Relative changes to cortical activity during REM sleep (lateral view) 41

2.4 Relative changes to cortical activity during REM sleep (medial view) 42

2.5 Relative changes to cortical activity during REM sleep (ventral view) 43

2.6 Mammalian memory taxonomy . . . . . . . . . . . . . . . . . . . 53

4.1 LCS: Environmental reward loop . . . . . . . . . . . . . . . . . . 112

4.2 LCS: State prediction reward loop . . . . . . . . . . . . . . . . . . 113

4.3 S → R → S Classifier system as a combination S → R . . . . . 114

4.4 LCS message and classifier cycle . . . . . . . . . . . . . . . . . . . 118

4.5 CFSC2 message frame . . . . . . . . . . . . . . . . . . . . . . . . 119

4.6 CFSC2 classifier frame . . . . . . . . . . . . . . . . . . . . . . . . 120

4.7 CFSC2 activation spreading within the message list. . . . . . . . . 121

4.8 ACS classifier configuration . . . . . . . . . . . . . . . . . . . . . 127

4.9 ACS Schematic . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

4.10 ACS expectation passthrough operation . . . . . . . . . . . . . . . 130

4.11 Specification of changing components . . . . . . . . . . . . . . . . 132

4.12 Specification of un-changing components . . . . . . . . . . . . . . 134

5.1 Simplified conceptual view of waking perception . . . . . . . . . . 169

5.2 Simplified conceptual view of dreaming perception . . . . . . . . . 170

5.3 Conceptual relationship between hippocampus & cortex . . . . . . 182

5.4 Conceptual formation of memory representation . . . . . . . . . . 183

5.5 Example of the concept of suggestive adaptation . . . . . . . . . . 190

5.6 Long and short term dream memory proposal: Real mode . . . . 198

5.7 Long term behaviour with the context of short term memory . . . 199

5.8 Short term behaviour with the context of long term memory . . . 201

5.9 Integration of new experiences in the existing memory . . . . . . . 203

5.10 Reactive agent; the environment encodes state. . . . . . . . . . . . 212

5.11 Partially reactive agent. . . . . . . . . . . . . . . . . . . . . . . . 213

5.12 Minority reactive agent . . . . . . . . . . . . . . . . . . . . . . . . 214

5.13 Building perception . . . . . . . . . . . . . . . . . . . . . . . . . . 215

5.14 Perceptual ambiguity concept . . . . . . . . . . . . . . . . . . . . 216

6.1 ACS operation under the influence of future expectations . . . . . 233

6.2 Agent switching between realities. . . . . . . . . . . . . . . . . . . 234

6.3 ACS agent switching between reality and dream worlds . . . . . . 237

6.4 Selection strategy relationship during simulated dreaming . . . . . 238

6.5 Distribution balance for each session . . . . . . . . . . . . . . . . 241

6.6 Experiment 1: Bounded walk, normalised results . . . . . . . . . . 248

6.7 Experiment 2: Bounded walk, normalised results . . . . . . . . . . 250

6.8 Experiment 3: Bounded walk, normalised results . . . . . . . . . . 252

6.9 Experiment 4: Bounded walk, normalised results . . . . . . . . . . 254

6.10 Model competency by experiment (Correct) . . . . . . . . . . . . 255

6.11 Bad model responses by experiment (M nc) . . . . . . . . . . . . 255

6.12 Total cycles by experiment (R+M) . . . . . . . . . . . . . . . . . 256

6.13 Experiment 5: T-maze, normalised results . . . . . . . . . . . . . 264

7.1 State and concept relationship . . . . . . . . . . . . . . . . . . . . 271

7.2 Generalised state relationships to concepts . . . . . . . . . . . . . 271

7.3 Contrast of classifier rule application . . . . . . . . . . . . . . . . 272

7.4 Concept to state relationship assumption . . . . . . . . . . . . . . 277

7.5 A retrospective association . . . . . . . . . . . . . . . . . . . . . . 277

7.6 A speculative association . . . . . . . . . . . . . . . . . . . . . . . 278

7.7 The general state association representation . . . . . . . . . . . . 278

7.8 T-maze state coding . . . . . . . . . . . . . . . . . . . . . . . . . 279

8.1 Simple grid world . . . . . . . . . . . . . . . . . . . . . . . . . . . 307

8.2 A simple 2 dimensional maze. . . . . . . . . . . . . . . . . . . . . 310

8.3 Agent maze ambiguity. . . . . . . . . . . . . . . . . . . . . . . . . 315

8.4 Repeated maze transitions . . . . . . . . . . . . . . . . . . . . . . 317

8.5 The interim agent system. . . . . . . . . . . . . . . . . . . . . . . 318

8.6 Simple maze & target cell. . . . . . . . . . . . . . . . . . . . . . . 320

8.7 Disembodiment of the implied meanings. . . . . . . . . . . . . . . 320

8.8 The possible internal representations. . . . . . . . . . . . . . . . . 321

8.9 Automatic state mapping schema . . . . . . . . . . . . . . . . . . 327

A.1 Bounded walk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349

A.2 Symbolic maze representation of the bounded walk problem . . . 349

A.3 Latent learning of the bounded walk problem . . . . . . . . . . . 350

A.4 Rat in the T-maze experiment . . . . . . . . . . . . . . . . . . . . 352

A.5 T-maze state coding . . . . . . . . . . . . . . . . . . . . . . . . . 352

A.6 T-maze state transition diagram . . . . . . . . . . . . . . . . . . . 353

A.7 T-maze state and agent position . . . . . . . . . . . . . . . . . . . 353

A.8 T-maze and latent learning . . . . . . . . . . . . . . . . . . . . . . 355

A.9 T-maze state alternative coding II . . . . . . . . . . . . . . . . . . 355

A.10 T-maze state transition diagram: Coding II . . . . . . . . . . . . 356

A.11 T-maze state alternative coding III . . . . . . . . . . . . . . . . . 356

A.12 T-maze state transition diagram: Coding III . . . . . . . . . . . . 357

Chapter 1

Introduction

1.1 Paradox

Imagine a future situation where brain imaging has made considerable technological advances, far beyond our current capabilities. The advance is so great that a device exists whereby the neural electrical activity of a subject can be observed by merely donning a pair of special eye glasses. The glasses are able to superimpose regional neural activity on the outline of a subject's head (for the purposes of discussion, call these N-glasses). Zero neural activity produces no superimposed colouring, in contrast to the bright luminous colours of high activity, producing an image similar to that rendered by contemporary PET scanners1 yet dynamic, convenient and in real time.

By wearing N-glasses, merely walking down the high street or going to work would make for a revealing experience. Despite the outwardly homogeneous appearance of the brain, like the body it has a functional locality that is remarkably consistent across individuals within a species. The regional variation in the neural activity of people and animals freely going about the business of the day would make a significant contribution to neuro-psychology, in addition to making for interesting social engagements among those wearing N-glasses. Waking behaviour patterns would of course lead to some fascinating insights, but what then would the observation of sleeping individuals reveal? Intuitively, one would expect this to bring a period of brain inactivity concomitant with the physical state.

1Positron Emission Tomography.

Apart from the occasional dream, most people would assume that the brain, controller of the body and host to the mind, would largely remain inactive. Herein resides the paradox that is the basis for the research presented in this thesis. Through the futuristic vision of the N-glasses, a sleeping individual, mammal or bird, would reveal a startling contradiction to the notion that the sleeping brain, like the body, rests.

Extrapolating the capability of the N-glasses still further, consider a visiting alien watching the Earth from a low orbit. Viewing the Earth's surface in terms of human neural activity, how would the picture be interpreted? The light, daytime side of the planet would be characterised by millions of moving pinpoints of light, consistent in their illumination but of variable intensity. On the dark, night-time side of the planet the same pinpoints of light largely remain motionless yet flicker brightly. Further investigation would only serve to confuse the situation. The light, daytime neural activity correlates with overt behaviour. The dark, night-time neural activity is also initially related to behaviour, but is later confounded by long periodic bursts of neural activity close to the levels observed during waking, yet in the absence of associated waking behaviour. If any such alien species were ever to witness this process, how would humanity explain this phenomenon? Why, contrary to most logic, does the brain reactivate in apparent isolation from the body?

Unfortunately the realisation of N-glasses is about as far away as the closest likely source of intelligent life,1 but this fictional scenario serves to demonstrate two important issues. Firstly, sleep is anything but rest for the brain; the restorative bodily rest of sleep does not by association imply rest for the brain any more than it does for the heart. Secondly, whilst there is no shortage of theories for the need or reason for sleep and the contribution of dreaming therein, none of these is able to provide convincing supporting proof. The problems facing the exposition of a theory of sleep and dreaming closely parallel the equally confounded search for an understanding of consciousness. The subjective experience of dreaming during sleep is without doubt one of the most significant and tantalising clues to understanding consciousness. The subjective experience of a dream demonstrates that, above all else, the experience of life is primarily an abstract one. Waking behaviour is a result of the mind synchronised with, and trapped by, the physical constraints of the natural world. Imprinted, yet free from physical connections, the mind does not cease to function; in fact quite the reverse. The aim of this research is in no way a back-door presentation of yet another theory of consciousness, but rather a direct, albeit low-level, hypothesis on the cognitive utility of sleep & dreaming and its application within the field of machine learning, embodied in the concept of Oneiric Processing1.

1The recently discovered planet 'Gliese 581c' (see also Star Trek, Gene Roddenberry, 1966-69).

The scientific evidence for the contribution of sleep, for both animals and humans, in terms of cognitive homeostasis2 and in particular with regard to memory, is strong. The underlying mechanisms responsible for this contribution are less clear; this is especially true of one particularly intriguing feature of mammalian sleep, that of the dreaming process. The consensus amongst researchers in the sleep and dreaming scientific community is, in general, that dreaming does represent some off-line cognitive function, but there is a less popular yet strong opinion that dreaming serves no off-line function and is merely an observation of the mind over a certain functional brain state. Despite all the claims and counter-claims in regard to the function of dreaming, one aspect of neural adaptation makes any negative claim against dreaming as adaptation hard to justify. This is the simple and well-established phenomenon that neural adaptation occurs by activation. Waking experience changes neural circuits merely by their activation (or inactivity); why, therefore, should the experience of a dream during sleep not also result in adaptation? Holding this position, how does the experience of the content of dreams contribute to cognitive homeostasis in humans, animals and machines?

1Oneiric: of or relating to dreams or dreaming. Adapted from Oneiric Behaviour, used to describe REM sleep re-animation (Jouvet, 1979).

2Cognitive homeostasis refers to the intrinsic process responsible for development and maintenance of the normal balance between experience, perception and behaviour in a changing environment.

The natural world has been the basis of much inspiration for machine learning systems, from classical animal reinforcement learning, through physiological neural synthesis, to the emulation of natural selection, all in search of an explanation of the natural strategies of intelligence which in turn can lead to better systems and machines. However, most of the machine learning mechanisms for the abstract manipulation of memory and agent structure are born of necessity and logical progression, and this research intends to take inspiration from dreaming to improve machine learning systems directly.

1.2 Nature

Metaphorically, 'a master stroke of genius' eloquently captures the emergence of cognition to such a degree as that found in modern humankind.1 After all the millions upon millions of branches and deviations, dead ends, false starts and mass extinctions, it transpires that the most successful survival strategy is the ability to become detached from the reality that was its originator and host.2 Not merely blindly reacting to the surrounding environment, but casting a critical eye at the surrounding environment and internally at ourselves. Evolutionary pressure has stumbled into the domain of meta-evolution. The emergence of an organism that can consciously take evolution from the physical substrate to an abstract one is the context, the juncture, at which humankind currently finds itself.

1A more accurate description would be a master stroke of luck.

2Reminiscent of the final deduction of the WOPR computer (aka Joshua) in the film War Games (Warner Brothers, 1983): 'the only winning move is not to play'.

Crucially, the stage is set for such a leap: the embodiment of mind in machine. Yet in spite of centuries of theorising, and very recent discoveries of both the processes leading to its formation and its resulting function and environmental interaction, the answer to how the mind is born of the brain feels tantalisingly close, but remains stubbornly elusive.

Several obvious paths are apparent in realising this goal. First, accept the proposition that it is not possible for humankind to understand the mechanisms that generate mind from brain. Two brute-force approaches are apparent in such a situation: namely, copy the mechanisms (and conceivably the state) of the human brain exactly, or restart the evolutionary processes that led to its development. The appearance of mind hence emerges from machine without actually understanding the mind, just the mechanisms leading to its creation.

Secondly, propose that it is possible to understand how mind results from the brain and then embody that understanding within a machine.

These propositions can be more generally framed: can a machine N construct another machine N + 1 that is dimensionally superior? Depending on the definition of the unit of dimension, this process should result in ever-increasing meta-cognition that is hard to comprehend for the originating machine and its creator.

There is the very real possibility that the first proposition will succeed before the second.1 Released from the current ethical constraints on human experimentation, artificial embodiment may of course help towards a greater understanding, leading to the satisfaction of the second proposition. However, it remains unclear whether blindly replicating, and probably extending run time and capacity, in the first instance will implicitly result in the solution of the second.

1The robot brain of Hector, depicted in the screenplay by Martin Amis for the 1980 science fiction film Saturn 3, was grown from a human fetus and developed by copying the thoughts and actions of a living human brain.

The second of these propositions is indicative of the author's belief and reflects the approach taken in this study: through elucidation of the mechanism, not of the processes that resulted in the mechanism, comes understanding and exploitation. Looking to Nature for clues in designing artificial adaptive systems2 in no way guarantees that a better engineered solution does not exist, just that this is one example of a solution. This issue is mitigated in two regards: firstly, the exploration serves not only to inspire but also to further understanding of the Natural world; secondly, the target environment for some of the control systems is the Natural world, aligning environment and control system within the same domain.

2'Artificial' as in a digital computer simulation of an adaptive system.

1.3 Background

The genesis for this research was a simple solution to a practical experimental problem. A previous project involved designing an adaptive system, based on the concept of the Adaptive Heuristic Critic (AHC), learning to control an inverted pendulum1 [(Cannon, 1967), page 703] in a failure-avoidance mode. The system was inspired by, and closely resembled, the similar systems of (Barto et al., 1983) and (Anderson, 1989). In the AHC one element of the system learns the error function (the critic), the output of which is used to train another element (the controller). In such a system the critic can learn an error function guided by only a simple environmental reward or penalty. In the case of the cart pole problem the critic is penalised in an identical manner both for failing to balance the pole and for running out of track.2 One of the problems of this approach is that it may take many trials for the critic to learn the error function, only then becoming useful in terms of training the controller. Training output from the naive critic during this phase can erroneously lead the controller away from the desired final mapping, further increasing overall training time. Although with this sequential learning the system eventually learns both to balance the pole and to avoid the ends of the track, the approach takes time.
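To make the critic/controller arrangement concrete, the following is a minimal sketch of an AHC-style update, assuming a linear critic and controller over a hand-crafted feature vector and a single terminal penalty. The original system used neural networks trained by back propagation, so the function and parameter names here (ahc_update, lr_c, lr_a) are illustrative assumptions, not the implementation described above.

```python
import numpy as np

def bang_bang_action(w_actor, phi):
    """Bang-Bang servo control: a fixed force, either positive or negative."""
    return 1.0 if float(w_actor @ phi) > 0.0 else -1.0

def ahc_update(w_critic, w_actor, phi, phi_next, action, failed,
               lr_c=0.1, lr_a=0.05, gamma=0.95):
    """One AHC-style learning step (sketch).

    The critic learns an error (value) function from the sparse penalty alone;
    failure is penalised identically whether the pole fell or the cart ran out
    of track, as described above.  The controller (actor) is then trained from
    the critic's temporal-difference error rather than from any detailed
    environmental signal.
    """
    reward = -1.0 if failed else 0.0
    v = float(w_critic @ phi)
    v_next = 0.0 if failed else float(w_critic @ phi_next)
    td_error = reward + gamma * v_next - v
    w_critic = w_critic + lr_c * td_error * phi          # critic: learn the error function
    w_actor = w_actor + lr_a * td_error * action * phi   # controller: trained by the critic
    return w_critic, w_actor
```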

A single computer system hosted the neural network simulation, control and graphics, in addition to the simulation of the cart-pole dynamics. This provided a convenient development platform before an attempt to control a real cart-pole system3. The Bang-Bang servo control4 presented no problems during the simulation, but caused significant mechanical stress to the experimental cart-pole construction, often resulting in its destruction at some point into the training cycle.

1Also known as the cart pole problem.

2This can lead to interesting controller behaviour; for example, the pole can be deliberately tipped in one direction, risking failure, in order that the controller can recover the pole balance and simultaneously shift to a new cart position in the same direction.

3The Plant.

4At every control interval a fixed force is always applied, either positive or negative, to the cart base. A variable or zero force is not permitted.

Previous success in simulation, and promising but as yet unfulfilled success on the physical plant, immediately led to one obvious solution. Much of the training time involved slowly adjusting the weight space by back propagation (Rumelhart & McClelland, 1986) of the critic and then the controller, and much of the training could have been achieved off-line in simulation before moving to the real plant. This would lead to an overall reduction in mechanical stress and the prospect of the real plant surviving long enough to balance the pole. By normalising the simulation and real plant control variables, the control system could seamlessly switch between a simulation of the cart pole and the real plant. Re-running the experiment by periodically alternating between the real plant and the simulation allowed some of the real experience (the inclusion of many non-linearities not represented in the model) to be carried through into the model interaction phase. Learning was therefore facilitated by an alternation between the model and the real plant, rather than by complete prior training in simulation. The physical plant was eventually balanced through training via this periodic switching between the simulation and the real plant (Holley, 1996).
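The switching schedule itself can be sketched in a few lines. This is not the original control code: train_episode, real_plant and simulation are hypothetical stand-ins, and the sketch simply assumes that, with normalised control variables, the same controller can be trained against either environment.

```python
def train_by_alternation(controller, real_plant, simulation, train_episode,
                         sessions=20, real_episodes=5, sim_episodes=50):
    """Periodically alternate between the real plant and its simulation (sketch).

    `train_episode(controller, environment)` is assumed to run one training
    episode against whichever environment it is given.  Short bursts on the
    real plant carry its non-linearities into the weights; the longer simulated
    phases do the bulk of the slow weight adjustment without mechanical stress.
    """
    for _ in range(sessions):
        for _ in range(real_episodes):
            train_episode(controller, real_plant)   # brief, stressful real interaction
        for _ in range(sim_episodes):
            train_episode(controller, simulation)   # extended, safe off-line training
    return controller
```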

The experimental fix was in reality a reaction to poorly constructed experimental apparatus, and it defeated the original intention of the system: learning in the absence of any detailed environmental reward, or simply avoiding failure. The periodic nature of interacting with the external plant and the internal model, in order to reduce stressful real-plant interaction, was suggestive of an analogy with waking and sleep in humans and animals.1 The benefit of this approach in the experiment was clear, but in the natural world how do sleep and dreaming contribute to waking behaviour? This set the scene for an investigation into the science of sleep and dream research with the aim of building better artificial adaptive systems.

1At the time, based on the subjective experience of dreaming: the incorporation of waking experience into dreams.

This previous work was not aimed at investigating off-line training policies, but it did give rise to questions about off-line adaptation and eventually to the initial question:

Given a quantity of experience, how best can this knowledge be utilised in order to become more successful in the future?

In the case of the previous example the motivation was limited to reducing learning time towards a preset user goal, to minimise plant damage. Other cases may be more complex, where the environment is changing, goals are moving, or perhaps both. In those cases simply reiterating sound previous examples will be detrimental. In the cart-pole problem, for example, if an extra jointed section were added to the pole on the real plant and not in the simulation, simulated experience would be detrimental. Conversely, there would be no benefit in exposing the system to a multi-jointed pole simulation if there is never a chance that this will become reality.

These thoughts led to investigations of model learning and planning, where various strategies are applied to improve agent performance: for example, reducing interaction with the real world, improving convergence time, and so on. Direct planning, for example, uses a model of the world to search out goals prior to interacting with the actual world, such as in maze solving.

Any agent that interacts with a problem, irrespective of whether a model is being generated from that world, faces the exploration/exploitation trade-off. This policy depends heavily on the problem, but one solution may be 'exploit all the time and only explore when absolutely necessary', or 'exploit mostly and occasionally explore'. Consider an agent that generates a model of the world. The agent can use the model in various ways, but suppose the agent interacts with the model as though the model were the real world. In this situation how does exploration occur, since the model cannot know the correct 'real world' response? Nevertheless, it may be that this is one of the most useful off-line policies, since this is where exploration is safe. The dilemma then is clear: abstract exploration may be safe, but it may also be incorrect and thus eventually unsafe.
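The idea of treating a learned model as though it were the real world has a standard counterpart in Dyna-style model-based reinforcement learning. The following is a minimal sketch of that standard technique, not of the classifier systems developed later in the thesis; env_step is a hypothetical interface to the real environment, and every replayed transition is only as trustworthy as the model that stored it.

```python
import random
from collections import defaultdict

def dyna_style_learning(env_step, start, actions,
                        real_steps=200, model_steps=20, epsilon=0.1,
                        alpha=0.1, gamma=0.9):
    """Sketch: learn from the real world, then replay the learned model
    as though it were the real world (safe, but only as correct as the model)."""
    q = defaultdict(float)      # (state, action) -> value estimate
    model = {}                  # (state, action) -> (next_state, reward)
    state = start
    for _ in range(real_steps):
        # 'Exploit mostly and occasionally explore'
        if random.random() < epsilon:
            action = random.choice(actions)
        else:
            action = max(actions, key=lambda a: q[(state, a)])
        next_state, reward = env_step(state, action)        # one costly real interaction
        model[(state, action)] = (next_state, reward)
        q[(state, action)] += alpha * (reward
                                       + gamma * max(q[(next_state, a)] for a in actions)
                                       - q[(state, action)])
        # Off-line phase: interact with the remembered model instead of the world
        for _ in range(model_steps):
            (s, a), (n, r) = random.choice(list(model.items()))
            q[(s, a)] += alpha * (r + gamma * max(q[(n, b)] for b in actions) - q[(s, a)])
        state = next_state
    return q, model
```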

These ideas may have a natural parallel: dreaming, or at least the concept of dreaming. The reiteration of coherent threads or stories, with bizarre deviations, as though real, may be a natural strategy to prepare animals for the future based on the past.

1.4 Methodology

The size and complexity of the cognitive domain of which this work is a part is appreciated. The possibility of designing a large simulation of advanced cognitive functions is rejected in favour of a more focused and disparate approach. In this thesis, the approach is based on the investigation of core issues in separation, with a view to integration later. Speculation, justification, application, experimentation and analysis of isolated issues are sought at a level that removes all but the essence of the respective issue. Although the applied problems appear trivial, they are applied within the context of extrapolation and scalability to larger real-world problems. The primary aim is to produce insights that further cognitive machine learning systems. Conjecture, discussion, design and operation of such systems contribute towards an understanding of the cognitive faculties of sleep and dreaming within the natural world. Below is an overview of the approach taken to the problem:

1. Review of the facts surrounding sleep & dreaming

2. Selection of the salient aspects

3. Reduction of those aspects

4. Hypothesis, proposed models

5. Implementation

6. Review of approach

The approach to the problem is characterised by a broad exploration of sleep and dreaming in conjunction with a narrow exploration of a particular class of machine learning systems, specifically learning classifier systems (LCS). A comprehensive investigation of such a vast and dynamic field as sleep and dreaming research is difficult1, and in relation to the research review there is a tendency towards memory and neurology and a neglect of evolutionary and other aspects.

A large portion of the research has been dedicated to a deliberately broad investigation into the current state of the art in sleep and dream research. Analogous comparison between the subjective experience of dreaming and off-line machine learning adaptation may well have been the catalyst for this research, but it cannot alone be the justification. Therefore a broad investigation into the nature of sleep and dreaming was required; the approach is graphically presented in Fig. 1.1.

1The standard medical text on sleep and dreaming divides into 125 subject areas in 1475 pages (Roth et al., 2005).

Figure 1.1: Parallel research approach

1.5 Thesis Layout

The research disseminated in this thesis has been organised into three major sections. Firstly, Part I examines the current state of the art in sleep and dream research, lays the foundation for the justification of the core hypothesis and provides linkage into the learning classifier systems (LCS) machine learning domain. Secondly, Part II examines the background and implementation of related anticipatory learning classifier systems. Finally, Part III presents the synergy of the dual-paradigm research and presents the hypothesis, proposed systems and experimental results, terminating with a research summary and conclusions.

1.5.1 Part I: Sleep, Dreaming & Learning

Sleep is an essential behavioural component of almost all animal life on Earth. In order to justify an argument for a cognitive adaptive component of sleep, especially in regard to dreaming, this section reviews the current state of knowledge from the perspective of the sleep and dreaming research field.

Sleep is the substrate of dreaming. To derive a system of adaptation based on dreaming without considering the broader field of sleep would be too narrow and naive an approach. Sleep, as with waking, is a catalyst for a vast range of biochemical, cellular and systemic changes, both physiological and psychological.

One aspect that requires special attention is the elucidation of the existence of dreaming in species other than humans. Whilst anecdotal evidence for dreaming in higher animals is strong, communication of that subjective experience is of course not possible. Therefore alternative methods of investigation are required in order to support the circumstantial evidence and refute anthropomorphic tendencies. Several sections describe research that justifies the argument that rodents, birds and cats experience a sleeping mentation which closely resembles the structure, if not the complexity, of human dreaming.

1.5.2 Part II: Latent & Machine Learning

For any animal or artificial agent to operate abstractly from the environment there are two essential minimum requirements. Firstly, there must be some method of associating or linking internal representations of environmentally derived experiences. Secondly, there must be an internal platform or space in which to replay these associations. Therefore, in order to plan, imagine and ultimately dream, an artificial cognitive system must have the ability to link, either directly or indirectly, internal representations of environmental states (and behaviour), and must provide an arena independent of the environment in which to replay those representations in isolation from it.
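A minimal sketch of these two requirements is given below, assuming discrete, hashable state representations; the class and method names are illustrative, not taken from the thesis.

```python
from collections import defaultdict

class InternalModel:
    """Sketch of the two minimum requirements above (illustrative only):
    (1) links between internal representations of experienced states, and
    (2) an arena in which those links can be replayed without the environment."""

    def __init__(self):
        self.links = defaultdict(dict)          # state -> {action: anticipated next state}

    def associate(self, state, action, next_state):
        """Link two internal state representations via the action that joined them."""
        self.links[state][action] = next_state

    def replay(self, state, steps):
        """Replay the associations internally: each anticipated state becomes the
        input for the next step, with no environmental interaction at all."""
        trace = [state]
        for _ in range(steps):
            if state not in self.links or not self.links[state]:
                break
            action, state = next(iter(self.links[state].items()))
            trace.append(state)
        return trace
```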

Given such prerequisite abilities, an animal or agent must first learn the environmental dynamics (typically by analysis of the reaction to direct agent actions) in order that these cognitive facets can be exploited. Therefore, in order for an animal or agent to perform better in the future, learning for the sake of learning, learning in the absence of any obvious reward, is in itself rewarding. In animal psychology this is the concept of Latent Learning. A combination of latent learning and cognitive features has been expressed within the machine learning field of learning classifier systems.

This section of the thesis firstly explores the natural phenomenon of latent learning and then reviews the various LCS incarnations that both learn latently and present an architecture for cognitive behaviour.

1.5.3 Part III: Heterogenesis

1.5.3.1 Hypothesis

A three-step approach is taken in this chapter to form the bridge between the sleep and dreaming research described in Part I, the latent learning and latent learning systems of Part II, and the remainder of the thesis, which contains several complementary theories, experimental work, developmental adaptive systems and the concluding hypothesis.

The first step is concerned with distilling the sleep and dreaming research described in Part I into a set of observations relevant to adaptive systems. On the basis of this set of observations, the second step develops several complementary theories on the adaptive contribution of sleep and dreaming. The third step presents an Integrative View; this orientates Oneiric Machine Learning within the wider field of adaptive systems, presents a formal definition and concludes with a discussion relating the oneiric approach to the general issues facing machine learning.

1.5.3.2 Consolidation

This chapter reports on a modification to the Anticipatory Classifier System (ACS) (Stolzmann, 1996) that allows the system to periodically supplant real experience with experience artificially generated from the current and transitory information represented in the classifier list. The resulting ACS agent stores experience by creating generalised classifier rules on-line and then periodically switches from interacting with the environment to interacting with its current internal representation of the environment. The system incorporates some of the key aspects of dreaming explored in the previous section, namely experience-dependent reactivation, emergent self-perpetuating thread creation and off-line adaptation.
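As a rough illustration of the dream-like emergent thread (a sketch of the idea only, not the adapted ACS implementation reported in the chapter), assume classifiers are ternary condition-action-effect strings in which '#' means 'don't care' in the condition and 'unchanged' in the effect:

```python
import random

def matches(condition, state):
    """Ternary matching: '#' in the condition matches any attribute."""
    return all(c in ('#', s) for c, s in zip(condition, state))

def anticipate(effect, state):
    """Build the anticipated next state: '#' means the attribute is unchanged."""
    return ''.join(s if e == '#' else e for e, s in zip(effect, state))

def dream_thread(classifiers, state, length=10):
    """Sketch of the self-perpetuating thread: starting from a recent real
    perception, a matching classifier's anticipation is treated as though it
    were the next perception, and the process feeds on its own output.
    `classifiers` is a list of (condition, action, effect) ternary strings."""
    thread = [state]
    for _ in range(length):
        candidates = [cl for cl in classifiers if matches(cl[0], state)]
        if not candidates:
            break                                   # the incomplete model runs dry
        condition, action, effect = random.choice(candidates)
        state = anticipate(effect, state)           # anticipation replaces real input
        thread.append(state)
    return thread
```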

1.5.3.3 Generalisation

This chapter (Generalisation) and the following chapter (State Augmentation) describe unfinished developmental work with anticipatory-classifier-style systems similar to those described in Related Systems and to the adapted ACS described in Consolidation. In both chapters, ideas based on theories developed in Hypothesis are used to create two prototype anticipatory-style classifier systems that present novel approaches to managing state generalisation and state augmentation. Effectively these are the first steps towards a complete oneiric-based classifier system.

In this chapter (Generalisation) a new concept classifier system, The Coupled Classifier System (CCS), is introduced. This system attempts to resolve simultaneously the issues of capacity and generalisation that the ACS (and other similarly structured systems) do not tackle. The CCS is a classifier system designed specifically as a platform for Oneiric Processing. The CCS is built from two key ideas born from the sleep and dream research, alongside issues with other anticipatory classifier systems. Firstly, the CCS is structurally different from other anticipatory classifier systems (see Sect. 4) in so much as matching experiences point, or are coupled, to other experiences, negating the need for state, action, state triplets in favour of simply states and indirect associations with other states: for example, from S → A → S to S → A → L(S), where L represents a linkage, coupling or redirection to another existing matching state (condition part). Secondly, the experience is not lost as a result of on-line generalisation but is resolved at periodic off-line intervals by a process dealing with capacity. Experiences are broken down into cognitive building blocks that (a) gracefully resolve capacity and, by their creation, (b) promote generalised representations and creative behaviour in the real world.
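A sketch of the S → A → L(S) idea in data-structure form follows; the field and function names are illustrative assumptions, not the CCS itself. The rule carries no explicit next-state string, only a link to another classifier whose condition part stands in for the anticipated situation.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class CoupledClassifier:
    """Illustrative rule for the S -> A -> L(S) idea: instead of an explicit
    effect string, the rule holds a link (coupling) to another classifier."""
    condition: str                      # matches the current perceived state
    action: str
    link: Optional[int] = None          # index of the coupled classifier, or None

def follow_links(rules: List[CoupledClassifier], start: int, max_hops: int = 5) -> List[str]:
    """Chase the couplings from one rule to the next, collecting the condition
    parts along the way -- the indirect state associations described above."""
    chain, index = [], start
    for _ in range(max_hops):
        if index is None or index >= len(rules):
            break
        rule = rules[index]
        chain.append(rule.condition)
        index = rule.link
    return chain

# Usage sketch: rule 0 anticipates whatever situation rule 1's condition describes.
rules = [CoupledClassifier('01#0', 'N', link=1), CoupledClassifier('10#1', 'E', link=None)]
```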

1.5.3.4 State Augmentation

In broad terms, sleep not only aids the retention of memory but simultaneously changes the representation, promoting later (waking) generalisation. The contents of this chapter are the result of thoughts regarding the abstract (sleeping) resolution of problems and ambiguities introduced during (waking) interaction with the environment. It is obviously difficult to represent the sort of abstract human problems known to benefit from sleep. Whilst rats are still very complex systems, rodent sleep studies have demonstrated that they dream about past (and possibly future) maze tasks. Rodent sleep studies employ simple mazes in order to relate neural activity to location (Sect. 2.8.2) to detect dreaming, rather than to study the effect of sleep in solving the mazes. Nevertheless, this relationship between re-running mazes during sleep and the problems of maze solving in machine learning led to thoughts of maze resolution in machine learning, especially in respect of classifier representations. Resolution in this context is not solving the maze, but rather the successful mapping of it, which subsequently promotes solution of the maze. In particular, the issue of state aliasing in non-Markov mazes remains a difficult one for learning-classifier-style systems. This chapter describes one (online-offline, waking-sleeping) approach to the resolution of such maze environments: automatically resolving environmental ambiguities with additional perceptual tags that are later resolved by repeated offline activation; that is, state augmentation and offline mediated resolution.
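A sketch of the on-line (waking) half of this idea, under simplified assumptions (perceptions as strings, a single tag counter; all names are illustrative, not the system developed in the chapter): when the same perceived state and action are observed to produce different next states, the state is aliased and an internal perceptual tag is appended for later off-line resolution.

```python
from collections import defaultdict

def detect_and_augment(experiences):
    """Sketch: flag aliased states and augment them with internal tags.

    `experiences` is an iterable of (state, action, next_state) tuples.  The
    off-line, sleep-like resolution of the tags themselves is not shown here.
    """
    outcomes = defaultdict(set)                 # (state, action) -> observed next states
    tags = {}                                   # aliased state -> number of tags issued
    augmented = []
    for state, action, next_state in experiences:
        outcomes[(state, action)].add(next_state)
        if len(outcomes[(state, action)]) > 1:  # conflicting outcomes: state alias
            tags[state] = tags.get(state, 0) + 1
            state = f"{state}+t{tags[state]}"   # augment perception with a tag
        augmented.append((state, action, next_state))
    return augmented
```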

1.6 Audience

This thesis brings together aspects from two disparate and developing fields of research: that of sleep and dreaming and that of machine learning. Inspiration from the former is used to further advance the latter. Background reviews of both areas form the foundation for the work, linked by a central Hypothesis section (Sect. 5). In this section ideas from sleep and dreaming are gathered into a set of observations and related to the issues facing machine learning. Finally, experimental work and prototype systems are presented using an anticipatory-classifier-style architecture. The work is mainly aimed at those working within the field of artificial adaptive systems and Natural Computation1 who may have no knowledge of sleep and dreaming. Nevertheless, this work is also likely to attract readers from sleep and dreaming and other closely related fields. The main contribution of the work, in the Hypothesis and Conclusions sections, is recommended for both audiences. The whole document is recommended for those approaching from the general field of artificial adaptive systems. Those approaching the document without any knowledge of machine learning can ignore the section Related Systems, the experimental work of Consolidation and the systems development work of the sections Generalisation and State Augmentation without losing the sense of the main contributions.

1Artificial life, genetic algorithms, swarm behaviour and neural networks are some examples within this genre.