
Case-Based Reasoning in Personnel Rostering

by Gareth Richard Beddoe, BSc

Thesis submitted to The University of Nottingham

for the degree of Doctor of Philosophy, 2004

Contents

List of Tables

List of Figures

1 Introduction
1.1 Personnel Rostering
1.2 Case-Based Reasoning
1.3 Research Objectives
1.4 Layout of the Thesis

I Background

2 Personnel Rostering Problems
2.1 Introduction
2.2 Dimensions
2.3 Rostering Approaches
2.4 Conclusion

3 Case-Based Reasoning
3.1 Introduction
3.2 Methodology and Research Issues
3.3 Case-Based Reasoning for Scheduling and Planning
3.4 Conclusion

II The CABAROST Model

4 A Nurse Rostering Model
4.1 Introduction
4.2 Nurses and Shifts
4.3 Constraints and Constraint Violations
4.4 Repairs
4.5 Problem and Solution Spaces
4.6 Problem Data
4.7 Conclusion

5 Case-based Repair Generation
5.1 Introduction
5.2 Case Structure
5.3 Case Retrieval
5.4 Repair Adaptation
5.5 Example
5.6 An Extended Adaptation Algorithm
5.7 Case-base Training
5.8 Performance
5.9 Conclusion

6 Violation Features and Weighting
6.1 Introduction
6.2 Measuring Classification Accuracy
6.3 Violation Features
6.4 Genetic Algorithm for Feature Weighting and Selection
6.5 Results
6.6 Conclusion

III Meta-heuristic Hybrids

7 Combining CABAROST with Tabu Search
7.1 Introduction
7.2 Algorithm Variants
7.3 Comparison of Algorithms
7.4 Conclusion

8 A Memetic Algorithm for Determining Repair Orderings
8.1 Introduction
8.2 Memetic Algorithm
8.3 Learning
8.4 System Evaluation
8.5 Conclusion

IV Discussion

9 Conclusions
9.1 Contribution
9.2 Method Evaluation
9.3 Applicability to Other Domains
9.4 Future Work
9.5 Dissemination

A Violation Feature Indices

B Repair Adaptation Rules

References

Abstract

In this thesis a novel Case-Based Reasoning (CBR) system called CABAROST (CAse-BAsed ROSTering) for a complex, real-world personnel rostering problem is presented. CBR is an artificial intelligence paradigm which attempts to solve new problems using information about the solutions to previously encountered problems. CBR is used to capture and store examples of expert rostering behaviour which are then used to solve future problems. Previous examples of constraint violations in rosters and the repairs that were used to solve the violations are stored as cases. CABAROST generates repairs for violations found in a roster which imitate the decision-making practice of the expert who trained it.

A number of research issues that arise from using CBR for personnel rostering problems are addressed, including: (a) representation of the nurse rostering problem as a constraint optimisation problem; (b) generalisation, selection, and weighting of case indices which will make the cases applicable to new problem instances; (c) retrieving and adapting cases from the case-base so that they are suitable to new problems; (d) hybridisation of CBR with meta-heuristic search methods; and (e) on-going case-base training and learning from failure.

The research was carried out in collaboration with the Queen’s Medical Centre University Hospital NHS Trust, Nottingham, UK. They provided their experience in rostering in the form of examples and real-world data, and are actively involved in the testing and evaluation of the developed software system. In addition, the methods are applicable more generally to a wide variety of scheduling and other combinatorial optimisation problems.

Acknowledgements

I would like to thank my supervisor Sanja Petrovic for all the advice and support that she gave me throughout my doctoral studies.

This research would not have been possible without the advice and expertise of Ella Bowers, Ann Watts, and the staff of the Ophthalmology ward at the Queen’s Medical Centre University Hospital NHS Trust, Nottingham, UK. I appreciate the time that they took out of their busy schedules to help us with this work.

The project was sponsored by the Engineering and Physical Sciences Research Council (EPSRC) under grant number GR/N35205/01.

Finally, I would like to thank the staff and students of the ASAP Research Group for making my time with them so enjoyable and productive.

To Mum and Dad,

Thank you for all the support and encouragement you have given me over the years. I couldn’t have done this without you.

List of Tables

4.1 The fields of NurseType

5.1 Reassign repair features
5.2 Swap repair features
5.3 Switch repair features
5.4 A simple example of a nurse roster
5.5 Example of case ranking
5.6 Example of repair ranking

6.1 An example roster
6.2 The features used to represent constraint violations in the case-base
6.3 Classification accuracy (with full initial feature set)
6.4 The average number of selected features (with full initial feature set)
6.5 Classification accuracy (with refined initial feature set)
6.6 Average number of selected features (with refined initial feature set)

7.1 Algorithm Performance

8.1 Constraints for the QMC problem
8.2 Initial case-base contents
8.3 Roster quality - CABAROST vs. Random Repair Generation
8.4 Roster quality: static (no training) vs. ongoing training
8.5 Roster quality: static (no case weighting) vs. case weighting on failure

A.1 Cover constraint violation feature indices
A.2 HardRequest constraint violation feature indices
A.3 MaxDaysOn constraint violation feature indices
A.4 MaxHours constraint violation feature indices
A.5 MinDaysOn constraint violation feature indices
A.6 MinHours constraint violation feature indices
A.7 SingleNight constraint violation feature indices
A.8 SoftRequest constraint violation feature indices
A.9 Succession constraint violation feature indices
A.10 WeekendBalance constraint violation feature indices
A.11 WeekendFill constraint violation feature indices
A.12 WeekendsInARow constraint violation feature indices

List of Figures

3.1 A reasoning episode according to Kolodner (1993)
3.2 The CBR cycle by Aamodt and Plaza (1994)

4.1 Nurse hierarchy
4.2 Basic repair action types
4.3 Sample preference roster

5.1 Example: Generalisation of a Cover Violation
5.2 Example: Generalisation of a HardRequest Violation
5.3 Example: Generalisation of a Succession Violation
5.4 Example: Generalisation of a Reassign Repair
5.5 Example: Generalisation of a Swap Repair
5.6 Example: Generalisation of a Switch Repair
5.7 Case structure
5.8 The retrieval algorithm
5.9 Adaptation algorithm
5.10 The alternative adaptation algorithm
5.11 Average cumulative number of exact and equivalent matches against case-base size over five 120 iteration runs
5.12 Effects of case-base size on solution quality

6.1 Pseudo-code for the algorithm for measuring classification accuracy
6.2 Feature weights for Cover Violations
6.3 Feature weights for MaxDaysOn Violations
6.4 Feature weights for MaxHours Violations
6.5 Feature weights for MinDaysOn Violations
6.6 Feature weights for Succession Violations

7.1 Mean, maximum, and minimum number of constraint violations
7.2 Mean, maximum, and minimum nurse satisfaction
7.3 Effects of tabu list on average number of constraint violations for the MARCH 2001 problem
7.4 Effects of tabu list on average nurse preference satisfaction for the MARCH 2001 problem
7.5 Effects of objective function on average number of constraint violations for the APRIL 2001 problem
7.6 Effects of objective function on average nurse preference satisfaction for the APRIL 2001 problem
7.7 A manually produced roster
7.8 A roster produced using CB-OBJ-TL-R10

8.1 Roster quality: static (no training) vs. ongoing training
8.2 On-going training: number of queries (1.5 and 2.5 thresholds)
8.3 Roster quality: case weighting on failure
8.4 A roster produced using the memetic algorithm

B.1 Adaptation rule for generating Reassign repairs of Cover violations
B.2 Adaptation rules for generating Reassign repairs of HardRequest, SingleNight, SoftRequest, Succession, and WeekendSplit violations
B.3 Adaptation rules for generating Reassign repairs of MaxDaysOn, MaxHours, MinDaysOn, and MinHours violations
B.4 Adaptation rules for generating Reassign repairs of WeekendBalance and WeekendsInARow violations
B.5 Adaptation rule for generating Swap repairs of Cover violations
B.6 Adaptation rules for generating Swap repairs of HardRequest, SingleNight, SoftRequest, Succession, and WeekendSplit violations
B.7 Adaptation rules for generating Swap repairs of MaxDaysOn, MaxHours, MinDaysOn, and MinHours violations
B.8 Adaptation rules for generating Swap repairs of WeekendBalance and WeekendsInARow violations
B.9 Adaptation rule for generating Switch repairs of Cover violations
B.10 Adaptation rules for generating Switch repairs of HardRequest, SingleNight, SoftRequest, Succession, and WeekendSplit violations
B.11 Adaptation rules for generating Switch repairs of MaxDaysOn, MaxHours, MinDaysOn, and MinHours violations
B.12 Adaptation rules for generating Switch repairs of WeekendBalance and WeekendsInARow violations

Chapter 1

Introduction

In this thesis a novel case-based reasoning approach to personnel rostering is presented. The problem of rostering nurses at the Queen’s Medical Centre, Nottingham, UK is investigated. A framework for solving this problem is presented which is sufficiently general to apply to many similar problems. In this chapter the basic concepts of personnel rostering and case-based reasoning are introduced and the overall research objectives identified. The layout of the thesis and the contents of each chapter are then described.

1.1 Personnel Rostering

The problem of allocating working patterns to employees is both difficult and time-consuming. Personnel managers often spend a large percentage of their time constructing rosters and in most cases do not succeed in producing rosters which satisfy both operational requirements and staff preferences. Poorly constructed rosters can have a number of detrimental effects on an organisation’s performance. Overstaffing leads to high employment costs but is often necessary for organisations with unpredictable service demands. Conversely, understaffing can result in lost revenues or poor service provision. Rosters which are perceived by employees to be inflexible or unfair can significantly affect staff morale and lead to increased absenteeism and staff turnover.

Rostering problems are found in many different types of organisations and industries, including manufacturing, call centres, schools, emergency services, energy producers, and transportation. This thesis will focus on the health-care industry, and in particular on nurse rostering. Nurse rostering provides some of the most highly constrained and difficult personnel rostering problems. Hospitals must balance staff shortages and budget restrictions with consistent and high quality patient care. In recent years, a recognition of the importance of flexible employment practices has increased the need for automated rostering tools which can cope with ever more complex problems.

Personnel rostering is defined as the problem of placing resources into slots in a pattern, subject to given constraints, where the pattern denotes a set of legal shifts defined in terms of work to be done [251]. However, the constraints imposed on the problem often cannot be satisfied completely, and the aim is to find the solution which violates as few constraints as possible. Real-world personnel rostering problems are highly constrained resource allocation problems which are difficult to solve manually [72]. Conflicting legal, management, and staff requirements must be considered when making rostering decisions. For example, management requirements for the cover and skill mix needed for a particular task often conflict with the maximum working hours allowed (by law and contract) as well as with individual staff preferences. In general, it is difficult, if not impossible, to predict how an attempt to satisfy one constraint will affect the others.

Personnel rostering problems have been a subject of interest within both artificial intelligence and operational research communities for a number of decades. Traditional operational research optimisation methods, especially mathematical programming, were initially employed to solve simple problems with few constraints. However, these were rarely useful for real-world problems, particularly as they frequently incurred computational cost exponential in the number of employees considered. Many mathematical formulations of personnel rostering problems are NP-hard - at the time of writing no algorithm is known which will solve such problems in polynomial time [19]. Consequently, many optimisation techniques from the artificial intelligence community have been explored for finding good quality (but not provably optimal) solutions. Rostering problems are often represented as constraint optimisation problems, where the objective is to minimise the overall violation of constraints. Many approaches have been developed using constraint satisfaction methods or meta-heuristic techniques such as tabu search, simulated annealing, and evolutionary algorithms. In general, they measure roster quality using objective functions which represent the violation of constraints in rosters, usually weighted according to the perceived relative importance of each constraint type.
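As an illustration of the kind of objective function described above, the following minimal Python sketch computes a roster penalty as a weighted sum of constraint violations. The `Violation` class, its `degree` field, the function name, and the weight values are assumptions made for this example and are not taken from any of the cited approaches.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Violation:
    constraint_type: str   # e.g. "Cover" or "MaxHours"
    degree: float = 1.0    # hypothetical measure of how badly the constraint is violated


def roster_penalty(violations: Iterable[Violation],
                   weights: Dict[str, float],
                   default_weight: float = 1.0) -> float:
    """Weighted sum of constraint violations; a lower value means a better roster."""
    return sum(weights.get(v.constraint_type, default_weight) * v.degree
               for v in violations)


# Illustrative weights only: cover is treated as more important than soft requests.
weights = {"Cover": 10.0, "MaxHours": 5.0, "SoftRequest": 1.0}
violations = [Violation("Cover", 2), Violation("SoftRequest"), Violation("MaxHours", 1.5)]
print(roster_penalty(violations, weights))  # 10*2 + 1*1 + 5*1.5 = 28.5
```

Choosing the weights is exactly the step that depends on the perceived relative importance of each constraint type, which is why such functions tend to be specific to one organisation.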

From a mathematical modelling perspective personnel rostering problems are difficult for a number of reasons:

- combinatorial solution space: the number of possible solutions to rostering problems usually increases exponentially with the number of staff, shift types, and with the size of the planning horizon [208]. Consequently, brute-force methods quickly become computationally intractable for all but the smallest of problems.

- highly constrained: operational requirements, complex employment laws, and staff requests all translate into a large number of constraints. For most problems it is impossible to satisfy all constraints and so solutions must be found which meet a core set of strict requirements, and otherwise violate as few of the remaining constraints as possible [235].

- organisation-dependent: the definitions of staff skill, shift types, planning periods, and, of course, of the myriad constraints vary considerably both between and within organisations. Methods suitable for one problem are not easily transferred to other problems - and frequently the unsatisfactory approach of ‘changing the problem to suit the method’ is employed, with poor results. The problem modelling process also suffers from the knowledge acquisition bottleneck due to the difficulties present in translating operational information into mathematical models.

- subjective: also linked to problems of knowledge acquisition is the issue of subjectivity [174]. Many of the decisions made by rostering experts (for example nurse managers) are of a personal, subjective nature, and are therefore difficult to model systematically. This is particularly true when considering the treatment of staff requests and preferences, which may need to be treated on a case by case basis.
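To give a feel for the combinatorial growth noted in the first point above, the sketch below counts candidate rosters under a deliberately simplified assumption: each staff member is assigned exactly one of the shift types, or a day off, on each day of the planning horizon. The function name and the example figures are illustrative only and do not describe any particular ward.

```python
def count_rosters(num_staff: int, num_shift_types: int, num_days: int) -> int:
    """Number of unconstrained rosters when each (staff, day) cell holds one shift type or a day off."""
    options_per_cell = num_shift_types + 1  # the extra option is a day off
    return options_per_cell ** (num_staff * num_days)


print(count_rosters(2, 1, 2))               # 2 options in each of 4 cells: 16
print(len(str(count_rosters(10, 3, 28))))   # 4**280 has 169 decimal digits
```

Even this small hypothetical instance of ten staff, three shift types, and four weeks is far beyond exhaustive enumeration, which is the point made about brute-force methods.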

Despite decades of research into automated methods, and some significant ‘academic’ successes, many organisations still resort to manual practices. Managers are reluctant to use automated rostering tools because they cannot use them to accurately and easily model their problems. Some of the unresolved issues in personnel rostering research are:

(a) Representation of domain knowledge: The development of successful personnel rostering methods is highly dependent on the correct representation of domain knowledge. Many of the early mathematical models were not flexible because of the rigid structure that they imposed on the problem representation. Although there has been a lot of research effort in the development of artificial intelligence techniques for solving scheduling problems in general, and personnel rostering in particular, the development of knowledge-based scheduling systems, which attempt to emulate the methods of the human schedulers, is recognised to be a difficult task [128]. In view of the complexity of real-world scheduling problems it is very difficult to define a list of ‘IF-THEN’ rules that describe the reasoning of scheduling managers. It is an inexact and time-consuming process, and can also lead to the development of inflexible and incomplete domain models [192, 222]. The nature of the human scheduling process is based on specific experience rather than on a set of general guidelines or first principles.

(b) Adapting the domain model: Personnel rostering problems can change rapidly due to evolving employment law, organisational restructuring, staff turn-over, and service demand, and most models struggle to ‘keep up’. Even the meta-heuristic methods that have been developed for rostering suffer from inflexibility. Their drawback is that they usually work well only in environments that are very similar to the instance of the problem for which they were designed. Each new instance of the problem requires significant changes to the model, for example, changes to the objective function which depend heavily on the researcher’s a priori understanding of the problem, changes to the neighbourhood structure in local search methods, etc.

(c) Reacting to unexpected events: Organisations such as hospitals must provide services around the clock. Events such as sudden increases in demand, unplanned leave, and staff sickness must be dealt with in ‘real-time’ by altering the roster [115]. Such alterations can have long-term effects on the satisfaction of constraints over the remainder of the planning period. Most automated methods are unable to handle such events gracefully and managers must resort to rescheduling manually.

These issues are addressed in this thesis by the introduction of a novel method for acquiring rostering knowledge from domain experts. This method provides a highly intuitive means by which experts can specify their treatment of rostering decisions. It is demonstrated that methods which use the acquired knowledge are capable of imitating human rostering methods and producing good quality rosters.

1.2 Case-Based Reasoning

Case-based reasoning (CBR) is an artificial intelligence methodology which uses solutions to previous problems to solve similar new problems. A data-base of past experience, called a case-base, is constructed and then used to solve new problems through processes of retrieval, adaptation, and memorising. Each experience is stored within a case, which usually represents a description of the problem, the solution, and an assessment of the outcome of using the case. Because knowledge is represented in CBR by example, rather than in the form of rules or functions, it can be used to model problems in complex real-world domains which are difficult to model systematically.

CBR operates on the premise that ‘similar problems will require similar solutions’. Indeed, the notion of similarity, of both problems and solutions, is key to the development of successful case-based systems. Problems are represented by sets of feature indices which store the information about a problem that is relevant to the problem solving for which it will be used. Similarity between problems is usually measured as a function of the differences between index values, although many other schemes exist. The choice of relevant case indices and the choice of accurate similarity measures are two of the most important design tasks faced by developers of CBR systems.
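One common way to realise a similarity measure of this kind is a weighted, range-normalised distance over the feature indices. The sketch below is an illustrative example under stated assumptions rather than the measure used by any particular system: the feature names, weights, and ranges are hypothetical.

```python
from typing import Dict


def similarity(a: Dict[str, float], b: Dict[str, float],
               weights: Dict[str, float], ranges: Dict[str, float]) -> float:
    """Similarity in [0, 1] computed from weighted differences between index values."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one feature needs a positive weight")
    distance = 0.0
    for name, weight in weights.items():
        # Normalise each difference by the feature's range so indices are comparable.
        diff = min(abs(a[name] - b[name]) / ranges[name], 1.0)
        distance += weight * diff
    return 1.0 - distance / total_weight


# Hypothetical feature indices describing two problems.
ranges = {"days_worked": 7.0, "hours_over_limit": 10.0}
weights = {"days_worked": 2.0, "hours_over_limit": 1.0}
stored = {"days_worked": 5.0, "hours_over_limit": 2.0}
query = {"days_worked": 6.0, "hours_over_limit": 2.0}
print(round(similarity(stored, query, weights, ranges), 3))  # 0.905
```

Setting a weight to zero removes a feature from the comparison, so the same function shows both design tasks at once: which indices to use and how much each should count.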

When a new problem is encountered a CBR system must first search for the most similar case (or cases) in the case-base. This process of retrieval may use a similarity measure to rank the cases in the case-base. The retrieved case, or cases, contain the solutions that were used to solve the problems that they represent. These solutions are used to solve the new problem. However, it is rarely possible to directly apply the old solution to the new problem, and therefore solutions must be adapted to the context of the new problem. Solutions in cases may represent instantiations of the domain variables, or the methods (and method parameters) that were previously used to solve them.

Learning is achieved in CBR systems by memorising new cases as they are encountered. When supervised by human experts case-based systems can be trained both initially and continuously throughout their lifetime. Under automated operation case memorisation must be undertaken with great care, to ensure that the cases stored are relevant, useful, and consistent with the existing knowledge. A large body of research is dedicated to the subject of case-base maintenance, which attempts to ensure that the quality of the cases in the case-base is maximal, and that the number of cases in the case-base does not increase unnecessarily (and thus impact negatively on the retrieval speed).
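The retrieval, adaptation, and memorising steps described above can be sketched as a small loop. This is an illustrative outline only: the `Case` and `CaseBase` classes, the negative Manhattan distance used as a similarity measure, and the trivial adaptation function are all assumptions, and the duplicate check merely stands in for the far more careful case-base maintenance discussed above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Problem = Tuple[float, ...]  # feature indices describing a problem (hypothetical encoding)


@dataclass(frozen=True)
class Case:
    problem: Problem
    solution: str  # stored solution, kept abstract here


class CaseBase:
    def __init__(self, similarity: Callable[[Problem, Problem], float]):
        self.similarity = similarity
        self.cases: List[Case] = []

    def retrieve(self, problem: Problem, k: int = 1) -> List[Case]:
        """Rank all stored cases by similarity to the problem and return the k best."""
        ranked = sorted(self.cases,
                        key=lambda c: self.similarity(problem, c.problem),
                        reverse=True)
        return ranked[:k]

    def memorise(self, case: Case) -> bool:
        """Store a case unless an identical one is already present."""
        if case in self.cases:
            return False
        self.cases.append(case)
        return True


def neg_manhattan(p: Problem, q: Problem) -> float:
    """Simple similarity: values closer to zero mean more similar problems."""
    return -sum(abs(x - y) for x, y in zip(p, q))


def solve(case_base: CaseBase, problem: Problem,
          adapt: Callable[[Case, Problem], str]) -> Optional[str]:
    """Retrieve the most similar case, adapt its solution, and memorise the result."""
    retrieved = case_base.retrieve(problem, k=1)
    if not retrieved:
        return None
    solution = adapt(retrieved[0], problem)
    case_base.memorise(Case(problem, solution))
    return solution


case_base = CaseBase(neg_manhattan)
case_base.memorise(Case((1.0, 0.0), "repair A"))
case_base.memorise(Case((5.0, 5.0), "repair B"))
# The adaptation step here simply re-uses the stored solution unchanged.
print(solve(case_base, (4.0, 5.0), lambda case, problem: case.solution))  # repair B
```

In a supervised setting the call to `memorise` would be replaced by a step in which an expert confirms or corrects the proposed solution before it is stored.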

CBR has been used successfully for a wide range of applications including planning and motivational analysis, navigation, building and circuit board design, military battle planning, purchase selection, legal analysis, machine control, medical diagnosis, automated theorem proving, and classification. It is particularly suited to complex real-world domains where problems can be solved (albeit slowly) by human experts, who rely on a combination of intuition, subjective judgement, and ‘rules-of-thumb’. CBR provides an intuitive interface for eliciting knowledge from domain experts because it requires only examples of problem solving episodes and not systematic details about how or why a particular method or solution was used. It can be applied to problems in a number of ways, from fully supervised interactive tools through to fully automated solution finding methods.

Recent years have seen an acceleration of interest in developing CBR systems for a variety of combinatorial optimisation problems including scheduling. In this thesis the focus will be on CBR applications in scheduling. These approaches use CBR to produce solutions in a variety of different ways. Some methods store full or partial solutions to scheduling problems which are re-used for problems with similar descriptions. Other methods are used to suggest algorithms or heuristics for solving problems based on certain descriptive or structural features. A third variety stores examples of repair operations which are applied to schedules in certain circumstances in order to improve their quality. In this thesis, this last approach, operator re-use, is explored for personnel rostering problems.

As a solution methodology CBR has great potential in the field of personnel rostering. There are staff managers with many years of experience in solving real-world rostering problems. These experts can provide a case-by-case history of rostering decisions with which to seed a case-base. The flexibility of CBR means that after training, solutions can be provided without supervision or, if the user wishes, a CBR system can be used as a decision support system whereby problems are solved interactively.

1.3 Research Objectives

In this thesis a CBR approach to personnel rostering is investigated. The novel CAse-BAsed ROSTering (CABAROST) method was developed to imitate the rostering decisions made by human rostering experts. During its development the field of personnel rostering was thoroughly investigated. This thesis presents the outcomes of research guided by the following objectives regarding the field of personnel rostering:

(a) To gain an understanding of the complexity of real-world rostering problems, in particular focusing on the use and treatment of constraints.

(b) To explore the use of various methods for solving rostering problems and to determine the current state-of-the-art in rostering technology. In particular, those methods developed recently for highly constrained real-world problems should be assessed.

(c) To model a complex rostering problem from a UK hospital and to explore the relevance of this model to other problems in the literature.

(d) To investigate the nature of human expert decision making for personnel rostering. This includes determination of the factors which influence the decision making, the types of decisions made for specific problems, and the circumstances in which alternative decisions can be made.

(e) To develop methods for solving the rostering problem modelled which are general enough to apply to a wide range of similar problems in rostering, scheduling, and the wider field of constraint optimisation.

(f) To investigate the subdivision of rostering problems into problem solving episodes, particularly focusing on the order in which violations of constraints should be repaired.

In addition to these personnel rostering research objectives, the development of CABAROST was required to meet the following CBR objectives:

(a) Explore the use of CBR for complex real-world problems, particularly in the problem domains of scheduling and planning, and combinatorial and constraint optimisation in general.

(b) Determine the suitability of CBR as a methodology for solving rostering problems.

(c) Explore the issues of problem representation and generalisation. Problem solving information in the case-base must be abstract and general enough that it can be applied to a wide range of future problems.

(d) Develop a CBR system for storing, retrieving, and adapting rostering episodes which imitate the behaviour of human experts. In order to evaluate the performance of the CBR system, methods for measuring the reasoning quality of a case-base must also be developed.

(e) Integrate the CBR system with meta-heuristic search methods for personnel rostering. Such methods should combine the abilities to successfully traverse the solution space with the knowledge contained in the case-base.

(f) Develop a strategy for training the case-base with expert knowledge. This involves both initial training (i.e. before the case-base is used) and on-going training to improve the system’s performance.

(g) Investigate how the CBR system can learn from failed reasoning episodes. This is vital if the methods are to be employed in real-world settings, where the chance of poor-quality cases entering the case-base may exist.

(h) Investigate technologies for maintaining case-base performance. Methods for tuning the representation of problems in cases to the contents of the case-base should be developed.

1.4 Layout of the Thesis

The chapters of this thesis describe how the research objectives were achieved. In Part I of the thesis background information will be given, describing the current state of research in the fields of personnel rostering and case-based reasoning. Part II describes the problem model and details of the CABAROST system. In Part III integrated approaches which combine CABAROST with two meta-heuristic search methods are presented. The thesis concludes with a discussion and overall evaluation of the research.

The individual chapters of the thesis are summarised as follows:

- Chapter 2 ‘Personnel Rostering Problems’ : An overview of the large body of re-

search on automated methods for personnel rostering is presented. The problem

1. introduction 10

types are classified based on a variety of factors and the dimensions of the prob-

lem are defined. A comprehensive review of the most important methods in the

literature is given.

- Chapter 3 ‘Case-Based Reasoning’ : In this chapter case-based reasoning is intro-

duced as a framework for imitating human reasoning. The key research areas

in the field are defined. The chapter concludes with a review of the use of CBR

in scheduling and planning problem domains.

- Chapter 4 ‘A Nurse Rostering Model’ : The problem of rostering nurses at the

Queen’s Medical Centre University Hospital NHS1 Trust, Nottingham, UK (ab-

breviated as QMC herein) is presented as a constraint optimisation problem.

The problem is defined in terms of an extensive set of customisable constraints

which the nurse managers would like to see satisfied. Different types of simple

‘repair’ operations for solving constraint violations are defined. Sample data

from the QMC is presented.

- Chapter 5 ‘Case-Based Repair Generation’ : CABAROST is presented as a method

for storing examples of constraint violations and corresponding repairs. The

issues of abstraction and generalisation are addressed so that the information

about violations and repairs can be widely applied to future problems. The im-

portant process of case retrieval and solution adaptation are described. During

retrieval cases are found in the case-base containing violations which are similar

to the violation which is being repaired. The repairs stored in the retrieved

cases must be adapted before they can be used to solve the new violation. The

chapter includes a discussion of the methodologies used to train the case-base

using examples of real-world rostering decisions from the QMC. Some basic

experiments are used to determine how closely CABAROST can imitate the

decision making of experts.

- Chapter 6 ‘Violation Features and Weighting’ : A large number of different in-

dices were used to describe the violations inside cases in CABAROST. Chapter

6 presents an automated method which simultaneously chooses good quality

problem features and determines their relative importance based on the cases

in the case-base.

- Chapter 7 ‘Combining CABAROST with Tabu Search’ : CABAROST is used to

generate repairs to individual constraint violations. In Chapter 7, algorithms

are presented which use CABAROST in an iterative fashion to attempt to solve

all of the constraint violations in a problem. These algorithms are based on

the commonly used tabu search meta-heuristic. Tabu search algorithms which

incorporate CABAROST generated repairs perform considerably better than

those which employ random repair generation strategies.

- Chapter 8 ‘A Memetic Algorithm for Determining High Quality Repair Orderings’ :

This chapter deals with the important issue of repair ordering for highly con-

strained rostering problems. A memetic algorithm is used to evolve good qual-

ity sequences of CABAROST generated repairs. In addition to the problem of

repair ordering, this hybrid method also addresses the issue of learning from

failure. The algorithm keeps a memory of violations which have been repaired

and can detect when a violation ‘re-appears’ in the roster. When this occurs the

case which was originally used to repair it is deemed to have failed, and a case-

weighting strategy is used to penalise it accordingly. An increase in case weight

reduces the chance that the case will be erroneously used for future problems.

The thesis concludes with a discussion about the effectiveness of the methods

presented and an analysis of how the research objectives were met. Suggestions are

given for future research which could continue the work presented in this thesis.

Part I

Background

Chapter 2

Personnel Rostering Problems

2.1 Introduction

Rostering problems can be found in almost all working environments but the most

interesting and difficult problems are found in organisations which employ large num-

bers of staff with multiple skills. The quality of the rosters produced for these or-

ganisations can have a tremendous impact on their survival. The financial success of

private companies is dependent on the conflicting factors of cost-effective man-power

utilisation and the wider effects of staff satisfaction and morale [229]. In the public

health-care and education sectors the cost of staffing must be balanced with complex

skill requirements and effective personnel management in order to best serve the

needs of patients and students [208].

Personnel rostering can be defined to be the problem of placing resources (staff

members), subject to constraints, into slots in a pattern, where the pattern denotes a

set of legal shifts defined in terms of work that needs to be done [251]. A wide variety

of constraints can be imposed on rosters depending on the legal, management, and

staffing requirements of individual organisations. The measurement of the quality of

rosters is also highly dependent on the organisation they are created for. In most

real-life problems this measurement must represent a wide variety of different factors.

The class of personnel rostering problems is a sub-set of the wider class of schedul-

ing problems [251]. In fact, they form some of the most highly constrained, and

therefore difficult to solve, examples of scheduling problems. This chapter does not

attempt to cover the literature in the wider field of scheduling, except where it is

directly applicable to personnel rostering. In general the nouns roster, schedule, and

timetable will be used interchangeably to describe the working pattern allocated to a

member of staff or group of staff members over a period of time.

Research into personnel rostering problems can be found in the literature from

the mid 1960s. Many examples of rostering problems, and the methods used to solve

them, have been published. In this chapter the main features of various problems will

be introduced and some of the key texts will be reviewed. Particular attention will

be paid to those approaches which represented significant progress in the field. The

reader is directed to any of the numerous review papers referenced in this chapter for

details of other, less significant, advances.

The methods which will be introduced later in this thesis were developed for

rostering nurses in hospitals. Consequently, the focus of this chapter will be on

methods for the rostering of nurses rather than more general personnel scheduling.

However, the definitions used are generally applicable to the wider class of problems,

and important approaches from outside nurse rostering will be considered. Nurse

rostering problems were among the first to be investigated by computer scientists

and operational researchers and have been used as test problems for many different

approaches.

This chapter will be organised as follows. The various dimensions which can

be used to categorise and describe personnel rostering problems will be defined in

Section 2.2. In Section 2.3 a comprehensive review of the state-of-the-art in personnel

rostering will be presented.

2.2 Dimensions

2.2.1 Employees

There are a number of ways that employees can be modelled in automated rostering

systems. Often the choice of representation reflects the solution approach being used.

The key differences between them are highlighted here.

Staff Pool Size

Many authors distinguish between two distinct employee planning problems: staffing

and assignment [32, 229, 240]. Staffing problems include the determination of the

number of staff to employ according to some forecast of the demand over a period, the

mix of skills that are needed in a particular unit, the design of shifts (i.e. start time

and length) which effectively cover particular demand profiles, and the generation

of working regulations for all staff and for different subsets of staff. Assignment

problems include the generation of on-off working patterns (deciding when staff will

take days off) and the allocation of specific shifts to specific employees. Although

there is a large degree of inter-dependency between these issues most of the published

research focuses on one problem or the other. The case-based reasoning methodology

presented in this thesis addresses the assignment problem and so staffing problems

will be referred to only briefly in this chapter. Nevertheless, they are an important

and relevant research area as the solutions to staffing issues are passed directly on to

the assignment problems in the form of parameters and constraints [239].

Some methods do solve both staffing and assignment problems simultaneously.

Burns and Carter [63] present a heuristic rostering approach which used daily demand

statistics and some simple working time constraints to calculate both the number of

staff required and the cyclical off-day pattern that these nurses would follow. Beau-

mont [32] developed a mathematical approach which determined the number of staff

required over different periods (according to heavily fluctuating demand), the times

at which shifts should start, and generated schedules for individuals. Some nurse

rostering methods use a pool of float nurses to address the problem of shortages in

rosters either during scheduling or after assignment has taken place [162, 235, 239].

Most of the methods described in the literature roster single units at a time al-

though some do consider entire organisations. In assignment problems for hospitals,

staff size per ward is typically in the region of 5-50 nurses (e.g. 6 in a Canadian ward

[43], around 20 in some Belgian hospital wards [55], 30 in a UK hospital ward [10]).

However, most methods are designed to allow variable staffing levels (constrained by

obvious computational implications for mathematically intensive approaches). Ap-

proaches that tackle institution-wide problems have been developed. Meisels and

Kaplansky [159] developed an agent-based approach which treats the hospital as a

controller and the individual staffing units as agents.

Representation

The representation of employees in rostering problems has evolved over the years.

In staffing problems there is usually no need to model actual employees because the

objective is to determine staff size. The emphasis on more flexible and staff-friendly

working regulations in recent years has enforced ‘individuality’ onto nurse rostering

problems. With this in mind, rostering methods can be classified as follows:

- Anonymous : employees are not modelled as individuals in the problem. Most

cyclical rostering methods by nature must treat employees in this way. Warner and

Prawda [239] identify staffing patterns for individual wards which will minimise

certain objectives, and then suggest that these patterns be used as an input to

an assignment scheduler. Ozkarahan and Bailey [187] generate good quality,

feasible sets of shift patterns for a number of wards across a hospital and then

suggest that nurses (and wards) should choose from the pattern sets according

to their individual circumstances.

- Personal : employees are modelled explicitly in the system. This approach has

the advantage that individual preferences and requests can be modelled rather

than simply relying on generalised requests for all nurses or groups of nurses.

Most algorithms use this approach as it was recognised very early that personal

factors are vital to successful employment [229]. Warner developed a form-based

methodology for eliciting preference and request information from individual

employees - and introduced the concept of ‘fairness’ with regards to the influence

that nurses have over rostering decisions. Modern commercial systems such as

ANROM [235], ORBIS Dienstplan [162, 163, 164], and INTERDIP [4] allow

individual ‘account’ style representations of nurses - and provide the means by

which to specify complex planning regulations on a per-nurse basis.

Skills and Qualifications

Some of the most significant decisions that must be made during rostering are those

concerning the type of staff concerned [249, 250]. Employees can be categorised ac-

cording to a number of factors, including their basic qualifications, skills and training,

experience, and even personality, gender and nationality. Managers must make plan-

ning decisions concerning the substitutability of staff - that is the ability of staff

members of one category to take over the roles and responsibilities of those in other

categories. In the literature a wide range of approaches are considered. For the

purpose of this thesis the following categories have been identified:

- Single-skill class : all staff are ‘the same’ from the point of view of the scheduler.

- Disjoint skill classes : staff can be divided into classes according to a single criterion.

These problems can be further subdivided according to the specification of

global requirements for minimum staffing levels (irrespective of class). When

there are no such requirements then each class can be rostered separately thus

considerably decreasing the problem size for those methods with combinatorial

complexity [17, 84].

- Multi-skilled : individual staff members have a set of skills associated with them.

This scenario is typical of problems which take a task-oriented approach (see

below) where a particular staff member's assignability to a task is dependent on

their skills [66, 102, 248]. Under a multi-skilled model staff members cannot be

subdivided easily into classes and this may have consequences for the tractability

of solution approaches.

- Hierarchical : staff members are arranged into a hierarchical structure. This struc-

ture may represent seniority (for example, qualified nurses outrank nursing

aides) or substitutability (staff members higher in the hierarchy can be used

when no staff of lower level are available, or vice versa), and commonly a com-

bination of both. Often penalties are associated with substitution of staff. This

can be used to represent dissatisfaction that senior staff may feel at taking on

remedial roles, or managerial dissatisfaction that unsuitably trained staff are

being given too much responsibility. Hierarchical structures can also be used

to represent supervisory roles, for example between qualified staff and trainees

[10, 157, 167].

- Mixed : the most flexible approaches in the literature adopt an approach which

mixes the multi-skill and hierarchical models [52]. In the method described in

this thesis a background hierarchy based on qualification levels is superimposed

on a set of skills for each nurse.
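The mixed model can be illustrated with a minimal Python sketch in which a qualification hierarchy is superimposed on a per-nurse skill set. The grade names, the `Nurse` class, and the functions `can_cover` and `substitution_penalty` are hypothetical and are not taken from the thesis or from any cited method. The sketch also assumes that substitution only runs downwards (a higher grade may cover a lower one), whereas the literature allows either direction.

```python
from dataclasses import dataclass, field
from typing import Set

# Assumed qualification hierarchy; a higher rank may substitute for a lower one.
GRADE_RANK = {"aide": 0, "enrolled": 1, "registered": 2}


@dataclass
class Nurse:
    name: str
    grade: str
    skills: Set[str] = field(default_factory=set)


def can_cover(nurse: Nurse, required_grade: str, required_skills=frozenset()) -> bool:
    """True if the nurse ranks at or above the required grade and holds every required skill."""
    high_enough = GRADE_RANK[nurse.grade] >= GRADE_RANK[required_grade]
    return high_enough and set(required_skills) <= nurse.skills


def substitution_penalty(nurse: Nurse, required_grade: str, per_level: float = 1.0) -> float:
    """Penalty proportional to the number of levels a nurse works below their own grade."""
    return per_level * max(0, GRADE_RANK[nurse.grade] - GRADE_RANK[required_grade])


senior = Nurse("senior", "registered", {"iv"})
junior = Nurse("junior", "aide")
```

Here `can_cover(senior, "enrolled", {"iv"})` holds, while `junior` cannot cover a duty requiring a registered nurse. The penalty term corresponds to the dissatisfaction associated with substitution in hierarchical models.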

2.2.2 Shifts and Planning Period

The wide variety of real-world scheduling problems that can be found in organisations

has led to a number of different approaches to representation. Representation issues

are inevitably a trade-off between the desire to realistically model real-world situations

and the demands of the scheduling methods employed (including the availability of

computational resources). In this section the differences between cyclical and

non-cyclical schedules, shift assignment types, and the length of the planning period will

be discussed.

Cyclical vs Non-cyclical

A cyclical roster is a common pattern of shifts which groups of employees follow

on a rotating basis. Many early automated rostering methods produced cyclical

schedules [8, 16, 30, 62, 68, 114, 152, 200, 228]. Tien and Kamiyama [229] define an

individual schedule as one assigned to a staff member which is different to that of

his/her co-workers. They define a common (cyclical, or rotational) schedule as one

which over time is the same as that of his/her co-workers. There are advantages to

using both approaches but it is noted that most modern scheduling systems focus on

the production of individual schedules.
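As a minimal sketch of a common (cyclical) schedule, the following Python fragment rotates one shared pattern through a group of employees. The pattern contents, the one-week offset between employees, the assumption of one employee per week of the cycle, and the function name `cyclic_roster` are all illustrative and are not drawn from any of the cited methods.

```python
from typing import List


def cyclic_roster(pattern: List[List[str]], n_weeks: int) -> List[List[str]]:
    """Build a rotating roster from a shared weekly pattern.

    pattern[k] is week k of the common cycle (seven shift codes).
    Employee i starts at week i of the cycle and advances one week at a time,
    so over a full cycle every employee works every week of the pattern once.
    """
    n = len(pattern)
    roster = []
    for i in range(n):  # one employee per week of the cycle (assumption)
        row: List[str] = []
        for w in range(n_weeks):
            row.extend(pattern[(i + w) % n])
        roster.append(row)
    return roster


# Hypothetical three-week cycle: D = day shift, N = night shift, - = day off
PATTERN = [
    list("DDDDD--"),
    list("--NNNNN"),
    list("DD--DDD"),
]
roster = cyclic_roster(PATTERN, n_weeks=3)
```

Over the three weeks each employee works the same number of night shifts, which is the sense in which such schedules are the same for all co-workers over time.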

Cyclical schedules have a number of advantages for both administrators and staff

members. These schedules need only be produced when there is a change in scheduling

requirement and so their maintenance overhead is reduced [49]. Furthermore, cyclic

schedules are predictable and allow staff to plan their social or family life around

working patterns which are certain well into the future. They are also perceived

as ‘fair’ because all staff members work unpopular shifts and work-stretches equally

[49, 152, 213]. From a mathematical point of view the formulation of problems is

simpler and many exact algorithms exist for the simple problems [63, 167, 213].

The most serious disadvantage of cyclical rostering is the lack of flexibility.

Individual preferences are difficult to incorporate and disruptive events such as staff

training, sick-leave, and annual leave tend to distort cyclic patterns [49]. It is very

difficult to model fluctuations in supply and demand using cyclic approaches [211].

Cyclic approaches are suitable for organisations with small numbers of low-skilled

staff and for such environments are still used today. They have fallen out of the scope

of modern rostering research due to the focus on larger, more difficult, real-world

rostering problems.

This thesis describes a non-cyclical approach to personnel rostering. Non-cyclical

rosters allow greater flexibility at a cost of reduced regularity and predictability.

Most of the modern, large scale rostering approaches described in this chapter are

for non-cyclical problems. Despite the lack of regularity and predictability of non-

cyclical rosters, employees appear to prefer their ad hoc nature [235]. Non-cyclical

rosters can more easily handle staff preferences and requests, fluctuating demand,

and disruptions due to staff absence.

Assignment Types

The traditional method for representing rosters for manual nurse scheduling is as a

two-dimensional grid with each row representing the roster, over the rostering period,

for a single nurse. Columns represent a subdivision of the rostering period into

(usually) even lengths of time (e.g. one day). In other applications columns might

represent tasks. For automated nurse rostering Cheang et al. [70] distinguish between

three ‘views’:

1. nurse-day views represent directly the two-dimensional grid used in manual

rostering where each variable is assigned a value describing the shift assigned

to a nurse on a particular day. For simple models binary variables are used

describing on (1) and off (0) assignments. More realistic models assign different

values according to the time of day that the staff member is to work.

2. nurse-task (or nurse-time) views are generalisations of nurse-day views. Each

variable represents the assignment of a nurse to a particular task or time-period

in the planning period.

3. nurse-shift pattern views use variables to assign nurses to pre-defined shift

patterns. The shift patterns are designed to be feasible with respect to certain

hard constraints and can be pre-evaluated with respect to their satisfaction of

soft-constraints.
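The nurse-day view can be sketched in Python as a grid with one row per nurse and one column per day. The shift codes, nurse names, and helper functions below are assumptions made for illustration only.

```python
from typing import Dict, List

# Assumed shift codes: E = early, L = late, N = night, - = day off
roster: Dict[str, List[str]] = {
    "nurse_a": ["E", "E", "L", "-", "-", "N", "N"],
    "nurse_b": ["-", "L", "L", "E", "E", "-", "-"],
}


def to_binary(rows: Dict[str, List[str]]) -> Dict[str, List[int]]:
    """Collapse shift codes to the simple on (1) / off (0) model."""
    return {nurse: [0 if s == "-" else 1 for s in days] for nurse, days in rows.items()}


def staff_on(rows: Dict[str, List[str]], day: int, shift: str) -> List[str]:
    """Nurses assigned to the given shift on the given day (0-based index)."""
    return [nurse for nurse, days in rows.items() if days[day] == shift]
```

The binary form corresponds to the simple on/off models, while the coded form corresponds to the more realistic models in which the value depends on the time of day worked.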

Nurse-day views are used most often throughout the literature [4, 26, 110, 121,

167]. Nurse-task views have been used for some nurse rostering problems [123, 240]

and the generalised concept is used frequently in the wider employee rostering

literature for task-oriented problems such as manufacturing [233] and transportation

[67, 227]. Pattern-oriented rostering can confer considerable computational advantages

due to the reduction in solution space [10, 57, 83, 84, 152, 208] but at the cost of increased

pre-processing demands (many shift patterns must be established and evaluated be-

fore scheduling can commence). In general, it is difficult to ascribe advantages and

disadvantages to the choice of representation as it is highly dependent on the nature

of the problem domain.

The length of shift patterns used depends on the organisation involved. Shifts can

be categorised as:

- On-Day/Off-day : Staff either work or they are off. For non-continuous operations

this represents the definition of a single shift type [62]. For continuous opera-

tions establishing on-off days can be a management task which is solved before

individual shifts or tasks are assigned [229].

- Fixed Shifts : The working day is subdivided into disjoint or overlapping shifts of

fixed starting time and length. Many hospitals operate on a three shift system

with day, evening, and night shifts defined [17, 52, 235]. This author found

that in British hospitals overlapping shift types are often used to accommodate

part-time staff.

- Variable shifts : In very flexible systems shifts’ starting and ending times are not

defined a priori but are established as part of the scheduling process [23, 179,

187, 198].

- Tasks : Task assignment models are used for scheduling staff with specialised skills

and qualifications [94, 176, 233, 248]. These models often specify working times

implicitly by the starting and ending times of the assigned tasks.

Planning Period

The length of the planning period has obvious computational implications. Many early

methods were severely restricted by the limited computing power available - in 1972

Warner and Prawda [239] limited this period to 4 days although this was blamed

on limited demand forecasting. Cyclic scheduling approaches overcame this to some

extent because working patterns did not need to be established for each individual

[49]. Non-cyclic scheduling methods usually tackle problems over four weeks (or

one month) [5, 7, 20, 43, 68, 121, 156, 162, 181, 215]. Other planning periods used

include one week [22, 66, 72, 84, 119, 145], two weeks [17, 46, 178], three weeks

[13], six weeks [29, 168, 240], twenty one weeks [111], and up to a year [45, 68]. A

characteristic of modern, well designed systems is the ability to specify the planning

period dynamically [52, 187].

2.2.3 Constraints

Real world rostering problems are particularly difficult to solve because of the large

sets of constraints which are imposed on them. These constraints are usually con-

flicting - satisfying one constraint may lead to another constraint being violated in all

but a handful of solutions [165]. For most real-world problems satisfaction of every

constraint is impossible and so methods for relaxing or weighting constraints must be

considered [164].

In this section the focus will be on constraints with significant operational rel-

evance. Such constraints act as a high level description of the rostering problems.

Constraints used by individual methods for mathematical modelling purposes, such

as those which specify that a member of staff cannot be in two different places at the

same time, will not be described here.

Coverage and Demand

From a management perspective ensuring that an adequate number of staff are avail-

able to meet the demands of the organisation is perhaps the most fundamental re-

quirement of rostering algorithms. Aside from employment costs, which are generally

measured as objectives to be minimised if they are included, meeting demand for

employee time is the key element used to measure the performance of staffing policy.

This author notes that all of the personnel rostering methods he has encountered

include coverage constraints of some kind. Coverage constraints define the numbers

of staff required over certain periods, or for certain tasks, and the mix of skills that

those staff members must possess.

Establishing coverage constraints for a rostering problem involves determining

the demand for employee time and skills. For nurse rostering problems this can be

measured in terms of patient numbers and the seriousness of the conditions being

treated [211]. Call centres measure the number of incoming phone-calls they get over

their hours of operation to determine how many operators to employ - and they must

ensure that there is sufficient redundancy to cope with periods of peak traffic [22].

The methods used for defining coverage requirements from historical demand are

beyond the scope of this thesis. It is assumed for the majority of personnel rostering

algorithms that these coverage requirements are fixed.

Coverage constraints generally define at least one of the following levels:

- Minimum Requirement : The smallest number of staff or the lowest ‘quality’ mix of

staff possible for the operation of the unit. For most methods this level must

be met for the produced roster to be considered feasible [4, 7, 29, 52, 123, 125,

155, 164, 219, 239].

- Ideal Requirement : The level of staffing which allows the unit to run comfortably.

This level indicates desired staff numbers for periods of average demand and

may not be met in circumstances such as staff shortages [52, 164].

- Maximum Limit : The largest number of staff that may be working over specified

times [29, 52, 120, 219]. This constraint is rarely stated explicitly and for units

with static staff levels may be ignored (usually because it is satisfied due to

strict temporal constraints). In some models this constraint is treated as an

objective - i.e. to minimise staffing cost through surplus [187].
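The three coverage levels can be sketched as a simple check on the number of staff assigned to a shift. The `Coverage` class, the function `assess_coverage`, and the returned labels are hypothetical. The sketch assumes that the minimum level is the only one that affects feasibility, which holds for most but not all of the methods cited.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Coverage:
    minimum: int                   # below this the roster is treated as infeasible
    ideal: Optional[int] = None    # desired level for periods of average demand
    maximum: Optional[int] = None  # upper limit, often left unspecified


def assess_coverage(assigned: int, cover: Coverage) -> str:
    """Classify the staffing level of one shift against its coverage levels."""
    if assigned < cover.minimum:
        return "infeasible"
    if cover.maximum is not None and assigned > cover.maximum:
        return "overstaffed"
    if cover.ideal is not None and assigned < cover.ideal:
        return "below_ideal"
    return "ok"


early_shift = Coverage(minimum=3, ideal=5, maximum=6)
```

With these assumed levels, two nurses on the shift is infeasible, four is feasible but below the ideal, and seven exceeds the maximum.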

Time related

There are a large number of different temporal constraints which can be applied to

rostering problems. These constraints are measured over varying periods of times

and for some real world problems are very detailed. The main types of time-related

constraints are listed here:

- Working hours : minimum/maximum hours that a staff member may work over

a set period (usually over one or two weeks but could be anything between a

single day through to a year [75] in some examples) [4, 55, 64, 73, 135, 178].

Minimum hours are important for problems where staff members have guaran-

tees stipulated in their employment contracts (such as in the problem described

in this thesis).

- Rest periods : minimum number of hours between working shifts [4, 55, 64, 73, 107,

- Consecutive shifts/days : minimum, maximum, or exact number of shifts or days

that can be worked in a row [43, 62, 64, 72, 135, 234]. Some problems also

specify the minimum, maximum, or exact number of shifts of a specific type

that can be worked in a row (e.g. night shifts) [72, 73, 122].

- Shift Patterns : patterns of shift types which should not be assigned. These could

be illegal pairs of shifts or undesirable shift patterns of larger length [55, 73].

- Weekends : minimum/maximum number of consecutive weekends or the maximum

number that can be worked over a period [62]. Some problems specify that staff

members must work both days of the weekend so that a split of the weekend

does not occur [43, 55, 62, 64, 72, 178, 234].

- Holidays : public holidays, annual leave, study periods, and other forms of pre-

dictable unavailability [53, 72].

- Historical : constraints which are specified over periods larger than the current

planning period such as yearly working hours constraints or on-going weekend

constraints [4, 53, 73, 135, 239].
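Several of the time-related constraints listed above can be checked directly on one nurse's row of a roster. The following Python sketch covers working hours and consecutive shifts only. The shift lengths, the limits, and the function names are assumed values chosen for illustration, not those of the QMC problem or of any cited paper.

```python
from typing import Dict, List

# Assumed shift lengths in hours; "-" denotes a day off
SHIFT_HOURS: Dict[str, float] = {"E": 7.5, "L": 7.5, "N": 10.0, "-": 0.0}


def longest_stretch(days: List[str], shift: str = "") -> int:
    """Longest run of consecutive working days, or of one shift type if `shift` is given."""
    best = run = 0
    for s in days:
        working = (s == shift) if shift else (s != "-")
        run = run + 1 if working else 0
        best = max(best, run)
    return best


def hours_worked(days: List[str]) -> float:
    """Total hours over the period covered by `days`."""
    return sum(SHIFT_HOURS[s] for s in days)


def violations(days: List[str], max_consecutive: int = 6, max_nights: int = 3,
               min_hours: float = 0.0, max_hours: float = 40.0) -> List[str]:
    """Names of the (assumed) temporal constraints that this row violates."""
    found = []
    if longest_stretch(days) > max_consecutive:
        found.append("consecutive_days")
    if longest_stretch(days, "N") > max_nights:
        found.append("consecutive_nights")
    if not (min_hours <= hours_worked(days) <= max_hours):
        found.append("working_hours")
    return found
```

Rest periods, shift patterns, weekends, and historical constraints would need further information (shift start times, calendar dates, previous rosters) and are omitted from this sketch.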

Personnel Preferences and Requests

Low staff morale can have very negative consequences for an organisation. In health-

care institutions it can have tremendous implications on the quality of patient care

[74, 169, 210]. Poorly designed rosters can cause considerable staff dissatisfaction and

lead to increased absenteeism and staff turnover [185]. One of the most important

factors affecting staff satisfaction is the level of perceived involvement of staff members

in the rostering process [210].

Staff members’ requests for individual days off, or specific shifts on certain days,

are allowed in many of the rostering methods described in the literature [4, 43, 121,

135, 230, 234, 239]. The treatment of these requests varies considerably between

approaches and numerous attempts to ensure ‘fairness’ of request allocation have

been proposed [17, 91, 239].

Requests usually represent ‘one-off’ constraints that do not occur on a regular

basis. Some methods include additional constraints which specify shift patterns which

staff prefer not to be assigned that are then applied on a regular basis. Warner

[240] introduced a questionnaire-based system which allowed nurses to weight certain

possible roster characteristics (e.g. single days off, long work stretches), according to

their ‘aversion’ to them. To ensure fairness each nurse was given a bank of 50 aversion

weights which they could spread throughout the choices. Arthur and Ravindran

[17] described a similar approach which involved nurses ranking characteristics on

a scale of 1 to 5. The rostering problem described by Dowsland [84] assessed a

large number of possible shift patterns according to their satisfaction of a number of

constraints, including nurse preferences and requests, and used the evaluated costs in

the objective function. The ANROM method [235] allows contracts to be specified on

a per-nurse basis which can include complex specifications of preference for certain

working patterns.
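The idea of a fixed bank of aversion weights can be sketched as follows. The bank of 50 comes from the description of Warner's scheme [240] above. The characteristic names, the requirement that the whole bank be used, and the scoring rule (weight multiplied by number of occurrences) are assumptions made for this example.

```python
from typing import Dict

AVERSION_BANK = 50  # points available to each nurse


def validate_aversions(weights: Dict[str, int]) -> bool:
    """True when all weights are non-negative and together use exactly the bank (assumption)."""
    return all(w >= 0 for w in weights.values()) and sum(weights.values()) == AVERSION_BANK


def aversion_score(weights: Dict[str, int], occurrences: Dict[str, int]) -> int:
    """Sum over characteristics of weight times the number of times each occurs in a roster."""
    return sum(weights.get(name, 0) * count for name, count in occurrences.items())


# Hypothetical nurse who dislikes single days off more than long work stretches
weights = {"single_day_off": 30, "long_work_stretch": 20}
```

Because every nurse distributes the same total, no individual can dominate the objective simply by declaring a strong aversion to everything.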

Hard and Soft Constraints

It is common throughout the literature to describe constraints as being either hard

or soft [52, 60]. Hard constraints are strictly applied and must be satisfied in order

for a roster to be considered feasible. Soft constraints are more flexible and do not

have to be satisfied. They are frequently used to describe roster quality through the

inclusion of penalty weights into objective functions.
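The distinction can be made concrete with a small Python sketch in which hard constraints decide feasibility and soft constraints contribute weighted penalties. The two example checks, the weight, and the function `evaluate` are hypothetical and serve only to illustrate the general scheme.

```python
from typing import Callable, Dict, List, Tuple

Roster = Dict[str, List[str]]
Check = Callable[[Roster], int]  # returns the number of violations found


def uncovered_days(roster: Roster) -> int:
    """Example hard check: days on which nobody is working."""
    return sum(1 for day in zip(*roster.values()) if all(s == "-" for s in day))


def night_shifts(roster: Roster) -> int:
    """Example soft check: total number of night shifts assigned."""
    return sum(row.count("N") for row in roster.values())


def evaluate(roster: Roster, hard: List[Check],
             soft: List[Tuple[Check, float]]) -> Tuple[bool, float]:
    """Return (feasible, penalty): feasible iff no hard violations; penalty is the weighted soft sum."""
    feasible = all(check(roster) == 0 for check in hard)
    penalty = sum(weight * check(roster) for check, weight in soft)
    return feasible, penalty


example = {"a": ["E", "N", "-"], "b": ["-", "N", "-"]}
result = evaluate(example, hard=[uncovered_days], soft=[(night_shifts, 2.5)])
```

In this example the third day is uncovered, so the roster is infeasible regardless of its penalty. A method that treats coverage as soft would instead move `uncovered_days` into the weighted list.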

The definition of constraints as hard or soft varies considerably between problems

and methods. Coverage constraints are generally considered to be hard - especially

when they define minimum coverage requirements [10, 52, 55, 125, 157, 202, 239].

However some methods do treat them as soft constraints [71, 163, 164, 168, 182,

183, 240]. Some temporal constraints, such as working hours that are set by law,

are described as hard constraints [43, 168], but usually the large number of temporal

constraints defined for most real world problems ensure they are treated as soft [53,

158, 164, 168, 240]. All methods surveyed treat staff preferences and requests as soft

constraints.

2.2.4 Objectives

Nearly all automated methods treat rostering as an optimisation problem and use

an objective function to define roster quality [70]. Early mathematical program-

ming methods are formulated to guarantee optimality with respect to these objective

functions and a wide variety of different objectives were defined for each problem ad-

dressed. Heuristic methods also limited the definition of objectives. Meta-heuristics

and constraint methods are generally more flexible and define roster quality by aggre-

gating the penalties associated with violated constraints. The objectives and mod-

elling approaches used by each method are described in more detail in Section 2.3.

Some of the characteristics of rosters used to define objective or penalty functions are:

- Minimising deviation from demand : can be defined as positive or negative deviation

or both [17, 43, 95, 186, 187, 239].

- Minimising number of employees : for methods which also plan staffing levels [13,

17, 21, 86].

- Minimising personnel cost : can restrict the number of employees planned, or the

higher cost of assigning many hours to highly qualified staff [164, 226].

- Minimising violation of soft constraints : usually using a penalty function [17, 55,

2.3 Rostering Approaches

In this section the most significant methods that have been published in the scientific

literature will be discussed. The papers described here represent a fraction of those

that have been written - in one recent review over 700 published articles were refer-

enced [89]. This author notes that there is a high degree of duplication of solution

approaches. For the purpose of this thesis papers have been selected as representatives

of their class.

As the focus of this thesis is nurse rostering most of the papers reviewed address

these problems. However a number of important general rostering methods are also

considered. Significant approaches from other domains, especially transportation, are

also presented as examples of some of the dimensions discussed in the previous section.

The papers also focus on assignment as opposed to staffing problems, although a

number of mixed approaches will be discussed.

The papers presented are categorised according to the solution approach used.

The approaches will be discussed in the order: mathematical programming, constraint

programming, artificial intelligence, decision support systems, heuristic methods, and

finally meta-heuristic search.

2.3.1 Survey Papers

Before describing some of the approaches to personnel rostering that can be found in

the literature some survey papers will be examined. Particular attention will be paid

to both the classifications that are used by each paper and the authors’ observations

of trends and potential research directions.

Tien and Kamiyama [229] subdivide the manpower scheduling problem into a five-

stage framework. They identify some key differences in rostering approaches including

the distinction between cyclical and non-cyclical rosters, the length of the planning

period, and the nature of the constraints applied. They divide constraints into two

categories: those which are “inherent to the structure of the [manpower rostering]

problem” (including coverage requirements and legal working hours), and those con-

straints which are specific to the application of the problem model. The five stages

of the framework are:

1. Temporal manpower requirements : the coverage requirements for each shift on

each day are determined according to the demand profile.

2. Total manpower requirements : the total number of staff (of each type) that

is required to meet both the coverage requirements and the other temporal

constraints is determined.

3. Identification of recreation blocks : the identification of periods in which em-

ployees are not assigned shifts - including the identification of single days and

weekends, and the maximum and minimum number of consecutive days that

can be taken off.

4. Scheduling the recreation blocks : the blocks identified in Stage 3 are assigned to

specific nurses either individually or as part of a rotating cyclic roster.

5. Assignment of shift schedule: the specific shifts assigned to nurses on their

assigned on-days are determined.

The authors present a number of models that are used by various approaches

to solve each of these stages. They also note that some of the stages are solved

simultaneously. The CBR-based methods introduced in this thesis do not address

Stages 1 and 2 - these values are presented as parameters of the problem being

solved. Stages 3 to 5 are addressed simultaneously - although the formulation does

not explicitly determine recreation blocks as identifiable elements.

Bradley and Martin [49] identify the importance of good personnel scheduling on

staff recruitment and retention. They define three basic rostering decisions: staffing,

personnel scheduling, and allocation. The authors investigate the relationship be-

tween these decisions - and the impact that each has on the final solution. They

conclude that there exist extensive interdependencies and suggest that a feedback

mechanism should be included to ensure that information about deficiencies in sched-

ules affects future staffing decisions. The authors also determined two features by

which to uniquely classify rostering algorithms. The first is the cyclical/non-cyclical

nature of the scheduling output, and the second is the underlying solution technique

used (heuristic, mathematical programming, or self-scheduling). The paper concludes

with an analysis of the issues that had not yet been addressed by scheduling algo-

rithms, including the design of shifts that meet the requirements of both staff and

employers.

Sitompul and Randhawa [211] surveyed the different models for the personnel

rostering problem, noting that heuristic, optimisation (i.e. mathematical and goal

programming), and artificial intelligence methods were employed. They characterise

rostering problems as being either cyclical or non-cyclical and note the complex sets

of constraints that can be required by hospitals. Their paper investigates the use of

decision support systems (DSS) for nurse rostering. In particular the authors note

that nurse rostering problems may have elements with a large degree of structural

consistency, but that some components require subjective assessment. They identify

four characteristics of a DSS for nurse rostering:

1. Problem models are less structured.

2. Models and solution methods are integrated with database access and retrieval

functions.

3. Ease of use and interactivity.

4. Flexibility and adaptability.

These characteristics lead to the design of systems which are less rigid in structure

and more in tune with the needs of the user. The elements of a DSS noted in this

paper can be found in many of the most successful nurse rostering methods developed

in the last decade. Of particular relevance to this thesis is the recognition that the

concept of decision making in personnel rostering is key to the generation of rosters

which satisfy both staff and managers.

Warner et al. [241] divide nurse management into two conflicting sets of issues:

patient-oriented and employee-oriented. The patient-oriented issues include the pa-

tient care philosophy used within the hospital which covers such factors as care plans,

necessary staff contact, and the assignment of individual staff members to individual

patients. The authors identify six employee-oriented issues which focus on different

staff management issues such as employee cost budgeting, staffing requirement deter-

mination, and long and short range scheduling. The paper includes a description of

what hospital administrators look for in rostering systems. Such qualities include ef-

fective scheduling of nurses and reporting functionality which measures and monitors

the performance and utilisation of staffing resources.

Jelinek and Kavois [124] wrote a survey paper with a similar format to that of

Warner et al. and Sitompul and Randhawa (in fact it was published in the same

publication). They make predictions about the future of nurse staffing and rostering.

The emphasis is placed on the effective management of staffing information and on

interactive decision-based scheduling systems.

Wren [251] defined the relationship between various scheduling problems, includ-

ing personnel rostering. He identifies the key aspects which scheduling problems share

in common:

- Objects : are the people, vehicles, classes, examinations, machines, jobs in a factory,

et cetera that must be linked in space and time in order to solve a particular

problem.

- Pattern: an ordering of events (e.g. working shifts) which may be created as part of

the scheduling process, or may be pre-defined to form a feasible set from which

members can be chosen to create overall schedules.

- Constraints : define physical or legal relationships between objects and between

objects and patterns. The author notes that constraints can be seen in two different

ways. Constraints may be seen as prescriptive rules which hinder the

achievement of the goals. However, they may also be seen as integral to the

problem specification and may help guide the user or solver towards a solu-

tion. It is also noted that decisions about relaxing constraints may need to be

included into the scheduling process in order to find good solutions.

- Schedule: includes all of the spatial and temporal information necessary to describe

how a task must be completed referring to the placement of objects in a pattern.

The author notes that the words ‘schedule’, ‘sequence’, and ‘timetable’ are used

as if they were synonymous in many papers. He gives definitions for the latter

two terms which show the subtle difference in meaning. A timetable gives the

times at which certain events are to take place and not necessarily where or

how. Sequences define the order in which events will occur (or resources will be

processed) but without necessarily including information about how long the

events will take.

Wren goes on to define rostering as “the placing, subject to constraints, of re-

sources into slots in a pattern. One may seek to minimise some objective, or simply

to obtain a feasible allocation. Often the resources will rotate through a roster.” Im-

portantly, Wren noted that many rostering problems do not have well-defined goals

- either in terms of satisfaction of constraints or minimising (maximising) certain

objectives. He asserts that the use of non-optimising methods can be justified when

different players (decision makers) have differing goals and expectations about the

solution outcomes.

Methods that are commonly used to solve scheduling problems are identified.

These include mathematical and heuristic methods. The paper also gives one of the first

reviews of the impact of meta-heuristic methods on the field of scheduling. Wren concluded

that meta-heuristic methods, including simulated annealing, tabu search, evolutionary

algorithms, heuristic based constraint logic programming, and ant algorithms, have

been largely successful despite their inability to guarantee optimality. Wren concludes

his survey with the observation that many researchers have attempted to ‘carry over’

approaches from one branch of scheduling to another with little apparent success.

He suggests that ‘cross-fertilisation’ may be feasible between closely related branches

(e.g. education timetabling and personnel rostering) and not between problems with

considerable structural differences.

Spyropoulos [221] wrote a review on the use of artificial intelligence techniques for

planning and scheduling problems in hospitals. Spyropoulos identifies personnel ros-

tering as just one of the many problems in the hospital environment. Other problems

include therapeutic planning, drug logistics, ambulance scheduling, and operating

theatre scheduling. Importantly, Spyropoulos notes that these problems are not

independent but must be substantially integrated in successful hospitals. A number of

different artificial intelligence techniques have been used for these problems, including

agent systems, machine learning, rule-based systems, and constraint reasoning.

Silvestro and Silvestro [210] reviewed the nature of manual rostering in British

hospitals. They identified three different approaches to nurse rostering for hospital

units (e.g. wards):

1. Departmental rostering which is conducted by a senior staff member who pro-

duces the roster for the whole unit.

2. Team rostering, where staff are divided into teams. Within each team the staff

members cooperate in order to produce team rosters. The team rosters are

coordinated for the whole unit either through discussions between team leaders

or through the mediation of the unit manager.

3. Self-rostering, where the roster is prepared by the unit staff, usually overseen

by a senior staff member.

These three approaches sat on a continuum with self-rostering and departmental

rostering at either extreme and team rostering in the centre. The paper reported

that staff satisfaction increased towards the self-rostering end of the continuum but

that self-rostering was unable to cope with complex rostering problems. Departmental

rostering was perceived as autocratic by staff members. It was also open to the risks

of favouritism and, if unsatisfactory rosters were produced, increased absenteeism.

However, departmental rosters were well coordinated and balanced and allowed for

far greater complexity with regard to constraints, shift size, and skill mix. It was

reported that the team-rostering/self-rostering approaches were increasingly

used within British hospitals.

Cheang et al. [70] provided a bibliographic survey of nurse rostering problems.

They described some of the various formulations that can be used to represent ros-

tering problems mathematically. Their classification method for the representation

of decision variables in nurse rostering problems is described in Section 2.2.2. The

authors also describe many of the different constraint types which are found in ros-

tering problems (see Section 2.2.3) and the standard techniques for measuring roster

quality. They go on to describe a number of solution techniques and discuss such

issues as problem initialisation and post/pre-planning processing.

Ernst et al. [89, 90] present a review of staff scheduling over a wide range of

industrial applications. They describe the rostering problems for the transportation

industry, health-care systems, emergency services, call centres, hotels, restaurants,

financial services, tourism, venue management, manufacturing, and retail stores. The

authors argue that rostering methods need to be further generalised in order to in-

crease their flexibility and applicability. In [89] the authors give an annotated biblio-

graphical survey of over 700 papers. This massive survey classifies papers according

to the application industry and the method used to solve the problems.

Blochliger [47] gives a basic introduction to staff rostering. The paper presents a simple

generalised model of rostering problems intended as a tutorial for those new to the

field. The level of detail included in this paper is fairly limited, but it does give a

good summary of the basics of personnel scheduling. Alfares [14] presents a review of

tour-scheduling literature. The tour-scheduling problem involves determining both

the hours during the day and the days during the week for each employee. Solution

techniques for these problems fall into one of ten categories: manual production,

integer programming, implicit modelling, decomposition, goal programming, working

set generation, LP-based solution, construction and improvement, metaheuristics,

and other approaches. Kohl and Karisch [132] review airline crew rostering methods

and describe in detail the different mathematical formulations of these problems.

A comprehensive review of survey and other key papers for the nurse rostering

problem has been written by Burke et al. [56] for publication shortly after the sub-

mission of this thesis.

2.3.2 Mathematical Programming Methods

Mathematical programming methods were among the first to be used to solve person-

nel rostering problems. They are capable of finding provably optimal solutions but in

general require restrictive models which satisfy sets of basic assumptions. As a result

they tend to be inflexible and incapable of solving realistic problems [235]. Their

combinatorial complexity also means that they tend to be slow at solving all but the

smallest of problems. Nevertheless, they have been successfully applied to specific

problems and have been used in a number of commercial products - particularly for

industrial domains in which guaranteed optimality is vital.

Most of the methods described here use integer or linear programming models.

The formulation of rostering problems in this way has the advantage that readily

available commercial LP/IP solver software such as CPLEX [116] or Microsoft Excel

Solver [98] can be used to find solutions. These methods are capable of finding optimal

solutions with respect to a single objective. Multiple objectives are tackled using goal

programming methods. These methods attempt to find solutions which satisfy sets of

objectives according to target levels and priority relationships. They provide a degree

of flexibility which the single objective methods lack.

Integer and Linear Programming

Many examples of these algorithms exist in the literature across a wide range of

industrial applications. The methods in general can be characterised by their focus on

a particular objective which must be maximised or minimised. Many of the methods

employ problem subdivision techniques and/or heuristic assignment rules in order

to reduce their complexity and thus increase their applicability to realistically sized

problems.

Warner and Prawda [239] formulated a mixed integer quadratic programming

problem to identify shift patterns which minimise the shortage costs in nursing care

over a rostering period across a number of wards. The model also calculates the numbers of

each skill class that will be assigned to each ward for each shift. The problem is

constrained by the total number of nurses of each skill class that are employed by the

hospital and minimum coverage requirements for the wards during shifts. A degree of

substitutability is permitted in the model, which allows nurses of one skill class to

take over the roles of another skill class which is in high demand. This is constrained

by a fixed substitutability level which limits how many times it can occur in a given

period. The model is generally applied to rostering periods of four days or less. The

method is anonymous in the sense that no personal information about individual

nurses is used - and hence no shift requests or individual preferences are considered.

The research is extended by Warner [240] with the introduction of concepts of

scheduling quality. A roster is measured according to its “desirability as judged

by the nurse who will have to work it”. A number of factors are incorporated in this

measure including the number of weekends off, periods of consecutive shifts, single

days on, as well as by taking into account the nurses’ requests for days off. A notion

of ‘fairness’ is also defined which measures the quality of nurses’ schedules against

those of their colleagues. The objective function is formulated as a summation of

penalty weights which apply to the violation of the constraints and thus represents

the penalty cost of the schedules. Nurses are assigned a certain number of weights,

proportional to the number of hours they work per period, which they then use to

weight their aversion to particular rostering features (e.g. weekends, split shifts, single

days on, etc.). This system allows for very fair evaluation of the quality of schedules.

The method is partially interactive by encouraging certain rostering decisions to be

made before using the automated algorithm. The algorithm was implemented in a

number of US hospitals.
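
The idea of scoring a schedule as a weighted sum of disliked features can be illustrated with a minimal sketch. It is not Warner's formulation: the feature names, the weight values, and the helper functions `count_features` and `schedule_penalty` are assumptions chosen purely for illustration, and the on-off pattern is assumed to start on a Monday.

```python
def count_features(pattern):
    """Count disliked features in a 0/1 on-off pattern (index 0 is a Monday)."""
    n = len(pattern)
    # A single day on is a working day with a day off (or the period edge) either side.
    single_days_on = sum(
        1
        for d in range(n)
        if pattern[d] == 1
        and (d == 0 or pattern[d - 1] == 0)
        and (d == n - 1 or pattern[d + 1] == 0)
    )
    # A weekend counts as worked if the Saturday or the Sunday is a working day.
    weekends_worked = sum(
        1
        for sat in range(5, n, 7)
        if pattern[sat] == 1 or (sat + 1 < n and pattern[sat + 1] == 1)
    )
    return {"single_day_on": single_days_on, "weekend_worked": weekends_worked}


def schedule_penalty(pattern, weights, requested_off=()):
    """Sum the nurse's aversion weights over the features found in the pattern."""
    counts = count_features(pattern)
    cost = sum(weights.get(name, 0) * count for name, count in counts.items())
    denied = sum(1 for day in requested_off if pattern[day] == 1)
    return cost + weights.get("request_denied", 0) * denied


# Hypothetical two-week pattern and aversion weights for one nurse.
pattern = [1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0]
weights = {"single_day_on": 3, "weekend_worked": 5, "request_denied": 10}
total = schedule_penalty(pattern, weights, requested_off=(10,))
```

An optimiser would then search for the set of patterns with the lowest summed penalty, subject to the coverage constraints.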

Abernathy et al. [5] use a stochastic programming model to solve the problem

of both planning staff numbers according to fluctuating demand and designing fixed-

period rosters. They represent the demand for nurses for a particular day and location

as a random variable. Two types of decisions are made by the algorithm. Policy

decisions dictate the numbers of nurses required on the ward over the rostering period.

Allocation decisions use the coverage requirements set out by the established policy

as constraints for the generation of rosters for the nurses. A complicated iterative

algorithm is developed which minimises the cost of employing nurses according to

the policy decisions and the deviation from the daily demand levels according to the

shift allocations of the nurses’ rosters. The authors suggest that their algorithm is

particularly useful for organisations with large and uncertain variations in the demand

for staff.

Trivedi and Warner [231] present a branch and bound algorithm for allocation of

float nurses for short-term rostering. This method is used to assign a pool of float

nurses to various wards in a hospital on a real-time basis (i.e. at the start of each

shift). A measurement of severity is used in order to determine the requirement level

of each ward according to the shortages that exist (due to staff absence, increased

patient load, etc.). The branch and bound method then searches for ward allocations

which minimise the shortage severity of the whole hospital, and also minimise the

variance in the shortage severity over all the wards. The method is presented as

a decision support tool which helps managers determine the severity of shortages

according to the reports they receive from the ward managers. The algorithm is

tested in a hospital over 5 nursing units.
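
A small branch and bound search of this general kind might be sketched as follows. The severity measure (the square of the unmet shortage, which also discourages uneven shortages across wards), the function name `allocate_float_nurses`, and the data are assumptions for illustration and are not taken from Trivedi and Warner.

```python
def allocate_float_nurses(shortages, pool):
    """Return (cost, allocation) minimising the summed squared unmet shortage."""
    best = [float("inf"), None]

    def severity(unmet):
        return max(unmet, 0) ** 2  # assumed severity measure

    def branch(ward, remaining, cost, alloc):
        if cost >= best[0]:
            return  # bound: severity is non-negative, so cost can only grow
        if ward == len(shortages):
            best[0], best[1] = cost, list(alloc)
            return
        # Try the most generous allocation first to find a good incumbent early.
        for give in range(min(remaining, shortages[ward]), -1, -1):
            alloc.append(give)
            branch(ward + 1, remaining - give,
                   cost + severity(shortages[ward] - give), alloc)
            alloc.pop()

    branch(0, pool, 0, [])
    return best[0], best[1]


# Hypothetical shortages on three wards and a pool of three float nurses.
cost, allocation = allocate_float_nurses([3, 1, 2], 3)
```

The bound used here is the simplest valid one (the cost accumulated so far); a practical implementation would use a tighter estimate of the cost of the wards not yet considered.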

Miller et al. [168] introduce a subdivision of constraints into hard constraints

(called the feasibility set) and soft constraints (called non-binding). The hard con-

straints dictate minimum staffing levels, restrictions on the work stretches, and the

maximum/minimum numbers of days nurses should work. The soft constraints set

stricter requirements than the hard constraints which should be met if possible. It is of

interest that individual nurses' shift requests are given the highest importance in the

model. Even the hard constraints may be violated if this will preserve the nurses’ pref-

erences. This results in penalties being used to measure violation of hard constraints

and the possibility that feasible solutions cannot be obtained. Overall the algorithm

attempts to minimise the sum of the penalties of the violations of constraints by as-

signing schedules consisting of days on or off. A cyclic-descent algorithm is used to

find near-optimal solutions with guaranteed convergence.

Bailey and Field [23] define the concept of ‘flexshifts’ for problems which normally

subdivide days into three 8-hour shifts. Shifts of 6, 8, and 10 hours are allocated

to employees in a linear programming model, allowing individual staff members to

choose shift lengths that suit them. Rosters are optimised by minimising the cost

of labour and the idle-times of staff members. In an alternative formulation Bailey

[22] presents an algorithm which treats rostering as the integration of two processes.

The first process involves allocating days on to employees over a period according to

constraints on the working patterns. The second process is that of determining when

on each of the allocated days on each employee must start. This second process aims

to schedule staff in order to meet fluctuating levels of demand over each day. The

method utilises algorithms that had been developed previously for the separate problems,

arguing that the processes must be integrated because their constraints interact.

The problem is modelled as an integer program which minimises staffing costs and

customer inconvenience due to understaffing.

Mason and Smith [156] developed an integer programming model for solving nurse

rostering problems. It considers a number of constraints including shift costs which in-

dicate nurses’ individual dislike of particular shifts, ‘shift transitions’ which encourage

consecutive shifts to start at the same time if possible but with the second shift later

otherwise, and penalty costs for particular combinations of days on and off. Variation

from target hours for each employee is also penalised. Solutions are calculated using

a graph modelling approach where nodes represent the shift assignments or days off.

Constraint penalties are dynamically modelled by weighting the arcs between these

nodes. The problem is thus that of finding lowest cost paths. The method allows for

non-overlapping skill definitions. The method is tested on data involving 86 nurses

with 7 skill levels and 5 shift types over 28-day periods.
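
The view of a nurse's roster as a lowest cost path can be illustrated with a minimal sketch over a layered graph, in which each layer is a day, each node is a shift assignment or a day off, node costs express dislike of a shift, and arc costs express penalties on transitions. This is a simplified dynamic-programming illustration and not the authors' model; the function `cheapest_line_of_work`, the shift labels, and all costs are assumptions.

```python
def cheapest_line_of_work(days, options, shift_cost, transition_cost):
    """Lowest cost path through a layered graph with one layer per day."""
    # best[o] holds the cheapest (cost, path) ending in option o on the current day.
    best = {o: (shift_cost[o], [o]) for o in options}
    for _ in range(days - 1):
        nxt = {}
        for o in options:
            nxt[o] = min(
                ((cost + transition_cost.get((prev, o), 0) + shift_cost[o], path + [o])
                 for prev, (cost, path) in best.items()),
                key=lambda entry: entry[0],
            )
        best = nxt
    return min(best.values(), key=lambda entry: entry[0])


# Hypothetical data: D = day shift, N = night shift, O = day off.
options = ["D", "N", "O"]
shift_cost = {"D": 1, "N": 0, "O": 3}
transition_cost = {("N", "D"): 10, ("N", "N"): 4}  # arc weights between consecutive days
cost, path = cheapest_line_of_work(3, options, shift_cost, transition_cost)
```

In this toy instance the night shift is individually preferred, but the arc penalties push it to the final day of the path.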

Jaumard et al. [123] give a linear programming model for rostering nurses which

satisfies coverage demands whilst minimising salary costs and maximising staff pref-

erence and the balance of shift assignments over teams. A variety of overlapping

shift types are defined which are used to cover a set of irregularly sized demand pe-

riods. Nurse preference is incorporated into the objective function and is weighted

according to seniority. Coverage constraints are complex in that they specify combi-

nations of nurse skills that must be present on the wards at any time. This method

allows nurses to have complex contractual agreements which can dictate very specific

working requirements. The algorithm was tested on data from a hospital in Canada.

Millar and Kiragu [167] introduce a mathematical model for cyclic and non-cyclic

rostering of nurses who work 12 hour shifts. This models an individual nurse’s ros-

ter as an alternating sequence of work-stretch and off-stretch patterns. The work-

stretches are used as nodes in a network used to generate an acyclic graph by which

the roster can be defined. The method allows for the definition of complex coverage

and working hours constraints. The method is implemented on CPLEX and tested on

small real-world problems. Although the method allows for the definition of reason-

ably complex problems it is not clear that it would be scalable to the large problems

encountered in many hospitals.

Bard and Purnomo [28, 29] give a combined linear programming and heuristic

model for solving nurse rostering problems with emphasis placed on satisfying in-

dividual requests whilst maintaining minimum coverage levels. The combination of

heuristic approaches with linear programming greatly reduces the computation time

required.

Eitzen et al. [88] present a method for solving workforce optimisation problems

with non-hierarchical, multi-skilled staff members. Because the skill structure is non-

hierarchical there is very little skill-substitution allowed. Employees are grouped

according to their core skills and the algorithm ensures that shift patterns are fair

within these groups. The algorithm simultaneously schedules the days off for each

staff member, constructs suitable sequences of tasks according to skill, and schedules

shifts to employees accordingly. The method attempts to minimise the number of

understaffed shifts whilst maximising roster fairness between employees. The method

is tested on data from a power station.

Other approaches have been developed for various staffing problems. Byrne and

Potts [65] developed an eight phase linear programming model for scheduling toll

collectors based on historical data about demand throughout 24 hour periods which

takes into account complicated working regulations for both full and part time staff.

Other LP/IP programming approaches have been used for the rostering of customs

staff [155], postal service staff [27] and staff at a container terminal [144].

The staffing of call centres has received specific attention in the literature and a

number of mathematical approaches have been used. Beaumont [32] used a mixed

integer approach to solve a call centre problem with very detailed demand statistics.

The problem model took into account staff cost, exchange capacity, and the cost of

having customers waiting. Integer programming methods are combined with sim-

ulation techniques by Henderson and Mason [108] and Atlason et al. [18]. These

methods use simulations to evaluate performance of rosters. The advantage of this

approach is that the complexity of the problem can be increased without the need to

calculate objective values on other models.

Mathematical approaches have been used in the domain of passenger transporta-

tion. Yan and Chang [253] developed a set-partitioning method which minimises

cockpit crew cost and which plans suitable pairings of cockpit crew members. Weir

and Johnson [244] introduced a three phase system to produce work-rest schedules

for flight crews by trying to take into account circadian rhythms. Freling et al. [96]

developed a flexible branch and bound system for airline and railway crew schedul-

ing which allowed a variety of complex constraints to be established by the user.

Other approaches include a CPLEX implementation of a mixed integer programming

approach to rostering crews on the London Underground [220], and a set-partitioning

approach to the problem of rostering staff in a multiple depot system [48].

Goal Programming

Goal programming methods allow decision makers to include multiple goals (or ob-

jectives) into their models and then rank them according to their preference. A target

level can be set for each goal and algorithms then attempt to satisfy all the goals,

or if this is not possible they satisfy them in the order of priority set by the decision

maker. These methods tend to provide good decision making tools because they allow

users to manipulate the input variables in an intuitive and natural fashion.
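
The priority ordering of goals can be illustrated with a minimal sketch that chooses between a handful of pre-computed candidate rosters by comparing their deviations from the target levels lexicographically. Real goal programming solves a mathematical program rather than enumerating candidates; the attribute names, targets, and candidate data below are assumptions for illustration only.

```python
def deviation(value, target):
    """Amount by which a quantity to be minimised exceeds its target level."""
    return max(0, value - target)


def preemptive_choice(candidates, goals):
    """Pick the candidate whose deviations are smallest in priority order.

    goals is a list of (attribute, target) pairs, highest priority first.
    """
    def key(candidate):
        return tuple(deviation(candidate[attr], target) for attr, target in goals)

    return min(candidates, key=key)


# Hypothetical summaries of three candidate rosters.
candidates = [
    {"name": "A", "understaffed_shifts": 0, "denied_requests": 4, "staff_size": 10},
    {"name": "B", "understaffed_shifts": 0, "denied_requests": 2, "staff_size": 12},
    {"name": "C", "understaffed_shifts": 1, "denied_requests": 0, "staff_size": 9},
]
goals = [("understaffed_shifts", 0), ("denied_requests", 2), ("staff_size", 10)]
chosen = preemptive_choice(candidates, goals)
```

Reordering the list of goals changes which roster is chosen, which is how a decision maker expresses a different perception of the problem.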

Arthur and Ravindran [17] introduce a multiple objective nurse scheduling model.

Their method minimises staff size, minimises the deviation from desired staffing levels,

satisfies nurses’ shift requests, and minimises staff dissatisfaction (with respect to the

quality of shift sequence). They allow the decision maker to decide on the priority

ordering of these objectives according to their perception of the problem. The days

on and off are first assigned using a 0-1 goal programming model. A heuristic is then

used to assign specific shifts to nurses according to the existing on-off pattern. In their

model the authors allow non-overlapping skill classes to be established and the model

schedules these classes one at a time - there is no substitution of skill levels allowed. A

set of questionnaire forms are introduced in order to elicit information from both the

ward managers and the individual staff. These ask the nurses to indicate their level of

dislike for certain characteristic patterns (such as split weekends) and to mention any

specific shift requests that they might have. However, the method does not include

the fairness weighting that was used by Warner in [240] to ensure that all nurses

receive equal treatment. The method is tested on data from hospitals in the United

States.

Ozkarahan and Bailey [187] present a goal programming approach to set-covering

models. Their method aims to simultaneously minimise deviation from required

staffing levels, minimise the deviation from available staffing levels so that all staff

members are allocated enough shifts to satisfy their contractual arrangements, and

to ensure that the number of days on are sufficient to allow the number of shifts

required for each day to be covered. The authors present a number of alternative for-

mulations based on alterations to the objective functions for each goal. This allows

decision makers to choose the combination which is most suited to their problems.

The method finds on-off patterns and allocations of shifts before assigning these to

individual nurses. As a consequence the method does not define any constraints based

on individual preference or contractual agreements.

Berrada et al. [43] have developed a multi-objective model for real-world nurse

rostering problems. This model includes a number of goals which are essentially soft

constraints which are satisfied if possible. These goals are to limit the number of

consecutive working days (to prevent long work-stretches), to prevent nurses from

working single days-on in isolation (off-on-off situations), to satisfy special requests

of individual nurses, and to group days-off (including grouping them with weekends-off if

possible). A number of hard constraints are also specified including specific weekend

working patterns, the maximum number of weekly working days, and the uniform

distribution of any surpluses or shortages of nurses over the week days. However, two

major simplifications have been made which limit the applicability of the approach.

The first is that only one skill-class has been defined and the authors do not discuss

the possibility for extending the model to include multiple skills. The second is that

nurses must work the same shift on every on-day. This is a significant assumption

which does not reflect the flexibility required of staff in modern hospitals.

Blake and Carter [45] introduce a goal programming approach for allocating hospital

physicians within acute care units. The paper presents a model for the selection

of cases to assign to physicians according to their capabilities and interests. The

objectives are to maximise revenue for the hospital, ensure physicians receive their

agreed level of income over a period (usually year by year), and to meet physicians'

preferences for a preferred and consistent mix and volume of cases. Experimental

results are presented on data from a hospital in Canada.

Azaiez and Sharif [20] present a 0-1 goal programming model for nurse scheduling

for Saudi Arabian hospitals. This model incorporates a large number of constraints

that are specific to the hospitals that were surveyed. Of interest is the requirement

that night shifts must constitute at least 25% of the total workload for each nurse.

The method subdivides the nurses into a number of subgroups and produces schedules

for each separately.

2.3.3 Constraint Programming Methods

Constraint programming methods are powerful artificial intelligence techniques which

have been used for many different combinatorial optimisation problems. The constraint

satisfaction problem (CSP) is the problem of assigning values, drawn from finite

domains, to variables so as to satisfy all constraints which involve those

variables [138]. Constraint based methods can be programmed using a variety of tools

- from generic programming languages such as C++ or Prolog, specialised libraries

for constraint methods such as ILOG Solver [117] and CHIP [82], to dedicated con-

straint programming tools. Their greatest advantage is in the natural way in which

constraint information is modelled. This facilitates the generation of intuitive sys-

tems with interfaces that can directly communicate any difficulties that may exist

with a given formulation. Many of the methods discussed here have been successfully

incorporated into commercial rostering packages.
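As a minimal illustration of the constraint satisfaction formulation above, the following Python sketch assigns one shift to each of three hypothetical nurses by simple chronological backtracking. The nurse names, shift labels and constraints are invented for illustration and are not taken from any of the systems cited here; commercial solvers use far more sophisticated propagation and search.

```python
# Toy CSP: variables are nurses, domains are shift labels.
# All names and constraints below are illustrative assumptions.

def consistent(assignment, constraints):
    """Check every constraint whose variables are all assigned."""
    for variables, predicate in constraints:
        if all(v in assignment for v in variables):
            if not predicate(*(assignment[v] for v in variables)):
                return False
    return True


def backtrack(variables, domains, constraints, assignment=None):
    """Return the first complete consistent assignment, or None."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(variables):
        return dict(assignment)
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        if consistent(assignment, constraints):
            result = backtrack(variables, domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]          # undo and try the next value
    return None


variables = ["ann", "bob", "cat"]
domains = {v: ["early", "late", "night"] for v in variables}
constraints = [
    # no two nurses share a shift, so all three shifts are covered
    (("ann", "bob"), lambda a, b: a != b),
    (("ann", "cat"), lambda a, c: a != c),
    (("bob", "cat"), lambda b, c: b != c),
    # ann has asked not to work nights
    (("ann",), lambda a: a != "night"),
    # bob cannot work the early shift
    (("bob",), lambda b: b != "early"),
]

solution = backtrack(variables, domains, constraints)
```

Each constraint is stated declaratively as a scope and a predicate, which reflects the natural modelling style noted above; the search procedure itself knows nothing about rostering.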

Darmoni et al. [79] present a software system called Horoplan for rostering nurses

in French hospitals. This method generates rosters over a 6 week period using a 9

step process. This process specifies when certain planning decisions are made and

the order in which parameters should be elicited from the concerned parties. Two

types of constraints are defined for the problem - hard constraints and ‘floppy’ (soft)

constraints. The hard constraints include satisfaction of daily coverage requirements

for each shift and the requirement for at least one day-off in every seven day period.

The soft constraints take into account a large range of temporal restrictions including

the long-term accounting requirements of French employment law.

Meisels et al. [157] present a flexible rostering method using a mixed constraint

network and rule-based approach. They argue that complex problems cannot be

solved by one method alone - constraint networks lack the flexibility to represent

real-world constraint information and the implicit nature of the knowledge in rule-

bases is difficult to process in a robust and predictable way. The authors establish a

set of non-trivial ‘generic explicit constraints’ which they believe are common to all

employee timetabling problems. These include constraints which control demand for

particular resource (skill) types and many types of temporal constraints, as well as

constraints necessary for the formulation of satisfiable constraint satisfaction prob-

lems. They introduce generic search heuristics and strategies for satisfying these

constraints. The rule-based part of the system is used to specify preference informa-

tion including individual shift requests as well as substitutability rules for employees

of different skill classes. The authors analyse a wide variety of real-world and theo-

retical problems and determine those combinations of constraints which are difficult

(or impossible) to solve. This method has been implemented in the commercial soft-

ware package TORANIT. TORANIT provides an interactive user interface which

guides users towards problem formulations which will be solvable by the system. In

a later paper [158] Meisels et al. describe an even more flexible approach which is

implemented in the EasyStaff software.

In a similar approach Meisels and Lusternik [161] investigate the nature of con-

straint networks of employee timetabling problems. This investigation focusses on

the relative difficulty of employee timetabling problems given different domain sizes.

In particular they discovered that for constraint networks an increase in the size of

the problem variables increases the likelihood that problems will be unsolvable. The

authors compare a number of different constraint processing techniques including for-

ward checking, conflict directed backjumping, and a combined approach. Detailed

results are presented based on experiments on randomly generated test problems. In

particular, the number of solutions that can be generated for problems of varying

characteristics is determined.

Two constraint based approaches have been developed for rostering problems in

Hong Kong hospitals. Cheng et al. [72] developed an algorithm which utilised a

redundant modelling approach in order to reduce the search time. This approach

analyses the set of constraints to determine those constraints which are redundant

because they are implied or ‘entailed’ by other constraints. This approach allows a

large number of constraints to be established with less risk of creating intractable

formulations. Chun et al. [73] describe the SRS nurse rostering system which

has been deployed across a large number of public hospitals in Hong Kong. This

approach is very practical and includes a large number of constraint definitions which

can be altered for individual problems.

Meyer auf’m Hofe [162, 164] has developed personnel rostering methods imple-

mented in the ORBIS Dienstplan software. His approach is among the most detailed

and flexible approaches in the literature. The system provides a library of different

constraint types which can be picked by the user - an approach which will be used

for the methods described in this thesis (albeit using a different mathematical for-

mulation). A large variety of different shift types can also be specified by the user

- including disjoint shifts and multiple similar shift types (e.g. shifts whose starting

and ending times differ by half an hour). Constraint types include:

- Minimal and Preferred Crews: the system allows the user to specify the minimal

and preferred number of nurses required for each shift type. Similar shift types

can also be grouped to have the same requirement. Skill mix can be specified

by the user by specifying those nurses who can be used to fulfill a particular

requirement - although it is not clear if this is achieved through the labelling of

nurse skills or through explicit specification of individual nurses.

- Balancing Time Accounts: these provide a large variety of temporal constraints

and ensure that there is minimum deviation from the working time specified in

individual staff members’ contracts. These can be specified over time periods of

varying lengths. Constraints can also be specified which balance the overtime

given to nurses across the unit.

- Rest Times: dictate the minimum and preferred rest times between shifts.

- Working Time Models: A database of preferred shift sequences is stored. If these

sequences are not used then a penalty value can be used.

- Undesired Sequences of Shifts: Specifies shift sequences which will be penalised if

they appear in the roster.

The system integrates traditional branch and bound methods for solving con-

straint satisfaction problems with local search approaches. The result is a system

which has been successfully applied to real world problems and is currently in use in

many German hospitals.

This work is extended to try to rectify some problems with feasibility in the

existing model [163]. Soft constraint information is modelled as sets of ‘fuzzy con-

straints’. These fuzzy constraints allow the notion of partial satisfaction of constraints

to be handled efficiently. The extended approach also includes more efficient variable

ordering heuristics.

Scott and Simpson [208] have developed a nurse rostering algorithm which com-

bines case-based reasoning with constraint technologies. A case-base of good quality

shift patterns is stored and used to provide initial solutions. These solutions are then

refined and corrected using constraint methods. This approach is especially relevant

to this thesis and will be described in more detail in Chapter 3.

There are a number of other constraint approaches which are similar to those

mentioned here such as a cyclical scheduling system for producing staff timetables

for up to 150 people over a year long period [68], a multi-skill crew rostering system

implemented using the ILOG Solver to solve the problem of rostering technical crew

for TV productions [140], and the commercial software INTERDIP developed by

Abdennadher and Schlenker [4] for rostering nurses using a highly interactive user

interface.

2.3.4 Artificial Intelligence

A number of approaches have been developed using a variety of artificial intelligence

models. The methods that will be described in this thesis are based on the case-

based reasoning paradigm; approaches in rostering (including Scott and Simpson’s

CBR/CSP approach) and more general scheduling will be discussed in more detail in

Chapter 3. Some interesting rostering methods based on other artificial intelligence

concepts will be discussed here.

Meisels and Kaplansky [159] proposed a distributed agent based personnel roster-

ing system which combines constraint satisfaction methods with the agent paradigm.

This method is designed for organisations which consist of departments whose roster-

ing problems are solved locally and then combined in such a way that they conform

to a set of global (organisation-wide) constraints. The model consists of a Central

Agent which coordinates a search performed by a number of Scheduling Agents. The

Scheduling Agents generate weekly schedules for individual wards in a hospital us-

ing standard constraint techniques. A process of negotiation is then coordinated by

the Central Agent to modify the ward schedules so that the overall solution satisfies

the global constraints. The method has been used to successfully solve distributed

employee timetabling problems in large hospitals.

Li and Aickelin [145] describe an approach to nurse rostering using Bayesian op-

timisation. This novel approach to nurse rostering attempts to explicitly learn good

scheduling rules from a set of promising rules. A rule is used to schedule every nurse

in a problem and the method works by evolving good quality rule strings. A network

is generated with 4 nodes per nurse, representing the 4 different scheduling rules that

could be used to determine the nurse’s working pattern. The network is trained on a

set of promising solutions by counting the usages of each of the rules and calculating

the conditional probability for every node. The four scheduling rules are:

1. Random Rule: The nurse’s shift pattern for the week is selected at random.

Feasibility is not considered by this rule.

2. k-Cheapest Rule: The shift pattern is selected randomly from a list of the k -

cheapest patterns according to the fitness function. Feasibility is not considered

by this rule.

3. Cover Rule: The nurse’s shift pattern is constructed by assigning them to the

days and nights over the week with the highest number of uncovered shifts.

This rule chooses from the set of feasible shift patterns that could be assigned

to the nurse.

4. Contribution Rule: This rule cycles through all the shift patterns which could

be assigned to the nurse and assigns each one a score according to a combination

of how well it covers uncovered shifts and the preference cost it entails for the

nurse. The best (feasible) shift pattern is then applied.

Feasibility is defined in terms of meeting cover requirements and ensuring cor-

rect working hours for the nurses. The fitness function is based on total preference

cost. An evolutionary approach is used to evolve good quality rule strings in an

iterative algorithm. The approach is used to generate weekly schedules for wards

with around 30 nurses. The authors found that the Bayesian approach rivals the

genetic algorithm and tabu search approaches developed previously [10, 84] for the

same problem. The Bayesian approach “mimics human behaviour more strongly than

a GA based scheduling system” and the authors hope to include more ‘human-like’

learning into scheduling algorithms in the future.
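Two of the four scheduling rules listed above can be sketched in Python as follows. This is only an illustration of the rule descriptions, not the authors' implementation: the shift patterns, preference costs, the `cover_weight` parameter and the function names are all assumptions, and the candidate patterns are assumed to have been filtered for feasibility beforehand.

```python
import random

# A shift pattern is a tuple of 0/1 flags, one per day of a (shortened) week.
# Patterns, costs and the weighting below are made-up illustrative values.

def k_cheapest_rule(patterns, cost, k, rng):
    """Pick at random among the k patterns with the lowest preference cost."""
    cheapest = sorted(patterns, key=cost.__getitem__)[:k]
    return rng.choice(cheapest)


def contribution_rule(patterns, cost, uncovered, cover_weight=10):
    """Pick the pattern with the best trade-off of cover gained and cost."""
    def score(pattern):
        covered = sum(1 for day, works in enumerate(pattern)
                      if works and uncovered[day] > 0)
        return cover_weight * covered - cost[pattern]
    return max(patterns, key=score)


patterns = [(1, 1, 0, 0), (0, 1, 1, 0), (0, 0, 1, 1), (1, 0, 0, 1)]
cost = {patterns[0]: 5, patterns[1]: 1, patterns[2]: 8, patterns[3]: 3}
uncovered = [0, 0, 2, 1]          # shifts still needing a nurse on each day

rng = random.Random(0)
choice_a = k_cheapest_rule(patterns, cost, k=2, rng=rng)
choice_b = contribution_rule(patterns, cost, uncovered)
```

With these values the contribution rule selects the pattern `(0, 0, 1, 1)`: it is the most expensive in preference terms but covers both days that still lack staff.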

2.3.5 Decision Support Systems

Whilst most successful employee rostering methods must have a large interactive

element there are some notable examples of methods that can be classified as decision

support systems. These methods are characterised by their ability to help users

(i.e. decision makers) to make operational decisions by providing highly interactive

tools for manipulating decision variables. They often provide a means by which to

test ‘what if’ scenarios and can supply alternative decisions and analyses of possible

consequences.

Smith et al. [214] developed a decision support system which allows users to

alter rosters by assigning and changing constraint weights. This feature enables users

to ‘tune’ the solutions produced according to their perception of the importance of

different objectives and staff preferences.

Khoong et al. [129] presented the ROMAN manpower rostering system, which was

designed to handle rostering problems in a wide variety of industries. This approach

subdivides the rostering tasks into long-term planning, rostering, and deployment

subtasks. The scheduling tasks are based around a set-covering algorithm which is

used to produce both cyclic and non-cyclic rosters, depending on the users’ require-

ments. The system allows manual changes to rosters through an interactive interface.

Abdelghany et al. [3] developed a decision support tool for ‘crew recovery’ in the

airline industry. The term ‘crew recovery’ describes the process of bringing staff from

one location, where they are not needed, to another location where they are needed,

in cases of disruption to services (e.g. due to adverse weather conditions). The system

uses a rolling approach to determine sequences of repair operations on the airline’s

schedule, to minimise the delays caused to flights. The user is given the option to test

scenarios involving different repair operations to determine their effectiveness.

2.3.6 Heuristic Methods

A heuristic is a ‘rule of thumb’ used to solve a problem and is defined by Reeves as

a method that produces a solution of an acceptable quality and incurred computa-

tional cost [197]. Heuristic solutions can never guarantee optimality in the way that

mathematical methods and some constraint programming methods can. In this re-

view the distinction will be drawn between constructive heuristics and meta-heuristic

approaches. Constructive heuristics provide very fast tools for generating solutions to

problems in a systematic fashion but require a relatively simple problem formulation

to be established. Meta-heuristic methods operate on existing solutions to try and

make them better and provide very flexible and adaptable tools. They have been

a very popular research area in recent years and have been implemented in several

successful commercial rostering products.

Constructive heuristics attempt to build or manipulate solutions by using simple

rules or procedures. These methods can in some cases be proven to solve problems to

optimality (or within certain acceptable bounds) using mathematical analysis (partic-

ularly for simple problems). For other approaches an empirical study of the success of

the method on example problems may suffice. Many heuristic approaches are highly

problem specific - they are designed to solve problems with particular characteristics

and are therefore often not suitable for more general application. Nevertheless the

low computational cost of many heuristic methods is very attractive and numerous

bespoke algorithms have been implemented for industrial and commercial problems.

These heuristic methods are also interesting for researchers as they can give informa-

tion about the success or failure of the methods that are often used during manual

human reasoning.

Smith [213] developed an interactive algorithm for developing cyclical nurse ros-

ters. This method helps the user to construct cyclical schedules by considering the

tradeoff between desired cover and temporal constraints. The temporal constraints

it considers are simplistic and no individual preference or request constraints are

allowed. Three shift types were scheduled - day, evening, and night shifts.

Burns and Carter [63] introduced a heuristic method for building cyclic rosters

and calculating workforce size for single shift problems. Their method allows for

variable demand and a number of simple temporal constraints (including weekend

constraints). Again, there is no provision for individually specifiable constraints due

to the cyclic nature of the rostering. The authors present a proof that their algorithm

is optimal and runs in linear time. Narasimhan [180] developed an algorithm for generating

non-cyclical rosters for single shift problems. Their method assumes hierarchical

workforce organisation and computes the workforce size and optimal mix of skills as

well as generating the rosters.

Caprara et al. [67] developed a heuristic rostering method for crew rostering on

mass-transit systems. This method aims to minimise the number of crews whilst

satisfying a large number of union and contract based temporal constraints. A fuzzy

set theory approach to aircrew rostering has been developed by Teodorovic and Lucic

[227]. This method uses fuzzy reasoning to determine the strength of the decision

maker’s preference to assign a certain shift pattern to a certain crew. Vairaktarakis et

al. [233] present a method for scheduling trained workers in synchronous production

systems. This method is aimed towards increasing the productivity of the plant and

does not include notions of staff preference.

2.3.7 Meta-heuristic Search

Compared to constructive heuristics, meta-heuristics demonstrate relative indepen-

dence from specific characteristics of the problems they aim to solve [197]. The

meta-heuristics used for most scheduling problems are generally described as neigh-

bourhood search methods. These algorithms typically start with a solution in which

all the variables have been assigned values, and then attempt to find better solutions

by exploring different ways by which to change the variable assignments. In this sec-

tion meta-heuristic search methods for the employee rostering problem are grouped

according to the algorithmic approach used.

Tabu-Search

Tabu-Search methods employ a memory of previously visited solutions in order to

avoid getting trapped in local optima in the search space. This memory usually takes

the form of a list of previous solutions or previous ‘moves’ (defined as operations which

change the variable assignment of a solution). This technique can cause temporary

degradation in solution quality because it always accepts the best neighbouring solu-

tion (not on the tabu list) even if this solution is of worse quality than the current one.

The degrading effects can be limited by including aspiration criteria which specify the

maximum reversal in quality allowed. If no available neighbour meets these criteria

then a solution from the tabu list may be considered (this works particularly well for

methods which store move operations on the list).
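The general mechanism can be sketched as follows. This is a generic, minimal tabu search over bit-vectors, not any of the published rostering algorithms: the single-bit-flip move, the fixed tenure and the toy cost function are assumptions made for illustration. The aspiration rule used here is the common "accept a tabu move if it beats the best solution found so far" variant, which is also an assumption and differs from the formulation given above.

```python
from collections import deque


def tabu_search(initial, cost, iterations=50, tenure=3):
    """Minimise `cost` over bit-vectors using single-bit-flip moves."""
    current = list(initial)
    best, best_cost = list(current), cost(current)
    tabu = deque(maxlen=tenure)          # recently flipped positions

    for _ in range(iterations):
        candidates = []
        for i in range(len(current)):
            neighbour = current[:]
            neighbour[i] = 1 - neighbour[i]
            c = cost(neighbour)
            # tabu moves are skipped unless they beat the best so far
            if i in tabu and c >= best_cost:
                continue
            candidates.append((c, i, neighbour))
        if not candidates:
            break
        # accept the best admissible neighbour, even if it is worse
        c, i, current = min(candidates)
        tabu.append(i)
        if c < best_cost:
            best, best_cost = list(current), c
    return best, best_cost


# Toy cost: number of days on which a working pattern differs from a target.
target = [1, 0, 1, 1, 0]


def hamming_cost(bits):
    return sum(b != t for b, t in zip(bits, target))


best, best_cost = tabu_search([0, 0, 0, 0, 0], hamming_cost)
```

The list stores moves (flipped positions) rather than whole solutions, which keeps the membership test cheap; the best solution is tracked separately because the current solution is allowed to deteriorate.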

Berrada et al. [43] developed a tabu-search alternative to the mathematical ap-

proach described earlier in this chapter. This search generated a set of feasible solu-

tions around the current feasible solution. The search then moved to the solution in

this set with the largest value according to the objective function unless that solution

is on the tabu list. The neighbours are generated by exchanging the positions of a

day off and a working day for each nurse. The rather simple tabu-search algorithm

generated by the authors was found to be less efficient at finding solutions than the

branch and bound method originally proposed - although they did acknowledge that

this was due to the experimental nature of the implementation.
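The neighbourhood described above, in which a day off and a working day are exchanged for one nurse, can be sketched as a generator. The roster representation (a tuple of 0/1 tuples, one row per nurse) and the function name are assumptions made for illustration; feasibility checking and the objective function are omitted.

```python
def swap_neighbours(roster):
    """Yield rosters that differ by swapping one nurse's day off with a working day."""
    for n, row in enumerate(roster):
        for i, a in enumerate(row):
            for j in range(i + 1, len(row)):
                if a != row[j]:      # one day worked, the other a day off
                    new_row = (row[:i] + (row[j],) + row[i + 1:j]
                               + (a,) + row[j + 1:])
                    yield roster[:n] + (new_row,) + roster[n + 1:]


# Two nurses over three days: 1 = working day, 0 = day off.
roster = ((1, 1, 0), (0, 1, 1))
neighbours = list(swap_neighbours(roster))
```

Because each move only exchanges two entries within a row, every neighbour keeps the number of working days per nurse unchanged.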

Dowsland [84] and Dowsland and Thompson [83] describe a tabu-search method

for rostering nurses. They base their research on a problem in a ward of a major UK

hospital. Their method searches for shift patterns to assign to nurses from a set of

patterns which have been pre-evaluated according to the soft constraints. The soft

constraints considered are:

- individual nurses’ requests for specific shifts or days off.

- individual nurses’ preference to work days or nights, or to work a particular shift.

- days off together or separate.

- number of consecutive days on.

- rotating night or weekend work.

- roster fairness and balance.

- working history (in terms of the cost of the previous schedule).

The feasibility of a roster is determined by its satisfaction of hard constraints con-

cerning cover and skill mix requirements and strict temporal constraints pertaining to

the working hours of nursing staff. The approach used a branch and bound method

in order to determine if the combination of nurses and cover requirements could pos-

sibly lead to a feasible roster before the scheduling began. If it was not possible then

bank nurses were added.

The tabu search they proposed used a variety of different neighbourhood struc-

tures. It used an ‘oscillation’ approach to move between feasible and infeasible so-

lutions - thus increasing the range of solutions it could sample. The neighbourhood

chosen at any time depended on the feasibility status of the current solution and

the change of value of the fitness evaluation based on the soft constraints. When

stagnation was detected abrupt alterations were made to solutions in order to move

the search to a new region of the search space. Although the method developed by

the authors was designed for a particular problem the general nature of their solution

should lend itself well to other similarly formulated problems.

Vanden Berghe [235] wrote a PhD thesis describing her work on the commercial

nurse rostering product ANROM. This work is also described in the papers she co-

authored with Burke et al. [52, 53, 54, 55]. Her thesis provides one of the most

comprehensive descriptions of the nurse rostering problem in the literature. ANROM

was designed to solve the wide variety of nurse rostering problems that can be found in

Belgian hospitals. It is highly configurable and allows users to specify a large number

of different constraints, and even to create their own using a variable counting sys-

tem. It attempts to provide maximum flexibility with regard to the description (i.e.

skills/qualifications) of staff members, the specification of shift requests, preferences,

and individual working contracts, and the design of shift types, planning periods, and

scheduling goals.

Vanden Berghe defines the following categories of soft constraints:

- Hospital Constraints: These are the general rules that the hospital imposes on

every ward. These include constraints dictating the minimum time between

assignments for individual nurses, and rules for the legal substitution of skill

categories (including upward and downward substitution - the author notes that

the upward substitution of staff is in fact more common in practice).

- Work Regulation Constraints: ANROM allows many different constraints to be de-

fined which describe the legal and preferable working patterns for all nurses,

subsets of nurses, and even individual nurses. These include standard temporal

constraints limiting, for example, the maximum number of assignments over a

period, the maximum/minimum number of consecutive assignments, the maxi-

mum/minimum number of consecutive free days, and the maximum/minimum

number of hours worked over a period. They include constraints limiting the

number of shifts of a particular type that can be worked over a period or in a

row. Weekend, bank holiday, and night shift constraints can also be defined.

Finally, constraints can be defined indicating penalty levels for particular shift

combinations and patterns of shifts.

- Personal Constraints: These are defined for specific nurses and include requests

for days off (which can be weighted individually depending on their severity),

requests for particular assignments, and tutorship relationships which specify

nurses who should be scheduled together if possible. It is also possible to define

pairs of individuals who should not work together.

All soft constraints are treated within the objective function through penalty

weights which can be manipulated by the user interactively. Solutions are evaluated

through a modular approach [53] which allows for maximum flexibility when spec-

ifying new soft constraints. This approach is designed to be efficient so that large

numbers of constraints can be modelled simultaneously without adversely affecting

the usability of the system.
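The idea of treating soft constraints as independently weighted penalty terms can be sketched as follows. This is not the ANROM evaluation method, whose details are given in [53]; the two evaluator functions, the weights and the example pattern are hypothetical and serve only to show how separate constraint modules can be summed into one objective value.

```python
def excess_consecutive_days(pattern, limit):
    """Count working days that extend a run beyond `limit` consecutive days."""
    excess = run = 0
    for works in pattern:
        run = run + 1 if works else 0
        if run > limit:
            excess += 1
    return excess


def ignored_day_off_requests(pattern, requested_off):
    """Count requested days off on which the nurse is scheduled to work."""
    return sum(1 for day in requested_off if pattern[day])


def penalty(pattern, weighted_constraints):
    """Weighted sum of violations over all soft-constraint evaluators."""
    return sum(weight * evaluate(pattern)
               for weight, evaluate in weighted_constraints)


pattern = (1, 1, 1, 1, 1, 0, 0)       # works Monday to Friday
weighted_constraints = [
    (10, lambda p: excess_consecutive_days(p, limit=3)),
    (5, lambda p: ignored_day_off_requests(p, requested_off=(4,))),
]
total = penalty(pattern, weighted_constraints)
```

Adding a new soft constraint in this arrangement means appending one more (weight, evaluator) pair, and changing a weight does not require touching any evaluator.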

The hard constraints define the cover and skill mix required on the ward over

varying time periods. ANROM includes a complex series of processes which checks

that rosters will be feasible with respect to the hard constraints before the actual

meta-heuristic search commences. These processes also perform an initial rough ros-

tering of the nurses to generate a feasible roster. One of the meta-heuristics used to

improve solutions is a hybrid tabu-search. This approach combines the tabu-list and

aspiration concepts with diversification strategies and ‘greedy shuffling’ - a technique

which models human scheduling behaviour. This greedy shuffling approach considers

all possible combinations of shuffle moves in the current solution and then performs

the best one. The author argues that this technique results in solutions which expert

human rosterers find very hard to manually improve. A variable neighbourhood ap-

proach has also been implemented for the ANROM system [54]. ANROM is currently

being used in a large number of Belgian hospitals.

Burke et al. [57] use tabu-search as the basis for a hyper-heuristic nurse rostering

method. The authors attempt to develop generic algorithms which are not restricted

to one problem. Hyper-heuristic approaches add an extra control layer on top of

classical meta-heuristics. This layer chooses heuristics and meta-heuristics according

to the state of the solution and the ongoing performance of each heuristic. The

algorithm is tested on the same problem described by Aickelin [9] and Dowsland [84]

and the results are competitive. However, it is not clear from the experimentation

carried out thus far that the method is applicable to a wide range of problems.

Tabu search approaches have been used by many others in the literature. Nonobe

and Ibaraki [181] use a tabu search approach for the constraint satisfaction problem.

This approach includes mechanisms for the automatic adjustment of the tabu tenure

(the length of the tabu list) according to the performance of the algorithm. Valouxis

and Housos [234] hybridise tabu search with an approximate integer linear program-

ming model to create a robust and efficient nurse rostering algorithm. A tabu search

algorithm for scheduling doctors in a hospital is described by White and White [248].

This algorithm focuses on the skills of doctors and the simultaneous rostering of the

medical students who work with them. A local search approach to designing good

shifts (in terms of their length and how they overlap) using tabu-like mechanisms is

presented by Musliu et al. [179].

Evolutionary Algorithms

Evolutionary algorithms are used to solve optimisation problems by imitating biolog-

ical models of natural evolution. Usually they are applied to scheduling problems as

genetic algorithms. These algorithms build populations of solutions and then com-

bine different parts of the best solutions to form new generations - thus mimicking

the processes of Darwinian evolution [31]. Local mutations can be applied to indi-

vidual solutions in order to diversify the search process. Memetic algorithms have

also been developed to solve scheduling problems. These algorithms generally consist

of a genetic algorithm combined with a local search approach [195]. In most of the

papers presented here a significant focus is placed on maintaining solution feasibility

through the design of complicated recombination and mutation operators.
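A bare-bones genetic algorithm of the kind described above can be sketched as follows. It is a generic textbook form, not any of the algorithms reviewed below: binary tournament selection, one-point crossover, bit-flip mutation, elitism and all parameter values are assumptions chosen for brevity, and the toy fitness function simply counts days on which a working pattern matches a made-up demand vector.

```python
import random


def genetic_algorithm(fitness, length, rng, pop_size=20, generations=40,
                      mutation_rate=0.05):
    """Maximise `fitness` over bit-strings of the given length."""
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]

    def tournament():
        # binary tournament: the fitter of two random individuals
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        elite = max(population, key=fitness)
        offspring = [elite[:]]             # elitism: keep the current best
        while len(offspring) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, length)             # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]                   # local mutation
            offspring.append(child)
        population = offspring
    return max(population, key=fitness)


demand = [1, 1, 0, 1, 1, 0, 0]


def matches(bits):
    return sum(b == d for b, d in zip(bits, demand))


best = genetic_algorithm(matches, len(demand), random.Random(1))
```

Nothing in this sketch keeps offspring feasible; in rostering applications that task usually falls to specialised operators or repair steps, which is where most of the design effort in the papers below is spent.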

Bailey et al. [24] developed a relatively simple genetic algorithm for the nurse

rostering problem. Simple constraints are defined, including cover requirements, skill

mix, and time between shifts. The objective function is a weighted sum of the staff

shortages, surpluses, and lack of continuity in the roster, and the goal is to min-

imise this objective. They use the rostering problem to demonstrate the ability of

meta-heuristics to escape local optima in the solution space. The authors intend

to investigate ways in which to incorporate staff preferences and requests into the

objective function.

Aickelin and Dowsland [9, 10] recognised that genetic algorithms are not good at

dealing with the conflicts between objectives and constraints. They used a classical

GA to solve the nurse rostering problem from a UK hospital described previously

in [84, 83] but found that it was unable to handle the constraints and produced

poor quality solutions. To improve the GA four different types of problem specific

knowledge were introduced:

- Co-operating sub-populations: A co-operative co-evolution strategy is used to breed

sub-populations representing solutions for each grade of nurses. A sub-fitness

function is introduced which measures the under-coverage for each nurse type

and is used to guide the evolution of the sub-populations.

- Incentives: Solutions are penalised if they are not balanced with respect to the

coverage of shifts over the period. This acts as an incentive to the GA to evolve

solutions with evenly spread violations of coverage constraints which will be

easier to improve in future.

- Repair: Improvement heuristics are used to cycle through shift patterns for each

nurse, accepting those which improve the solution fitness.

- Local-search: A hill-climbing algorithm is used to improve feasible solutions with

respect to nurse preference.

The authors found that the rosters produced by the improved GA approach were

as good as those produced by the tabu search approach developed by Dowsland [84]

when used for small time periods, but did not do as well over longer runs.

In a different GA approach by the same authors an indirect encoding strategy

was used [11]. The GA operated on permutations of the nurses. A heuristic-decoder

function was used to convert good-quality orderings of nurses into feasible rosters.

This approach was able to produce rosters which were as good as those produced

using the tabu-search approach.
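The principle of an indirect encoding can be illustrated with a small decoder. The greedy rule used here (each nurse, in the given order, takes the pattern covering the most still-uncovered days) is a hypothetical stand-in and not the decoder of [11]; the nurses, patterns and demand are likewise invented. The point is only that the search operates on orderings while a deterministic procedure builds the roster.

```python
def decode(order, patterns, demand):
    """Greedily turn a nurse ordering into a roster and the remaining demand."""
    remaining = list(demand)
    roster = {}
    for nurse in order:
        def gain(pattern):
            # number of still-uncovered days this pattern would cover
            return sum(1 for d, works in enumerate(pattern)
                       if works and remaining[d] > 0)
        chosen = max(patterns[nurse], key=gain)
        roster[nurse] = chosen
        for d, works in enumerate(chosen):
            if works and remaining[d] > 0:
                remaining[d] -= 1
    return roster, remaining


demand = [1, 1, 1]                       # one nurse needed on each of 3 days
patterns = {
    "ann": [(1, 1, 0)],                  # ann has a single allowed pattern
    "bob": [(1, 1, 0), (0, 0, 1)],
}

roster_good, uncovered_good = decode(("ann", "bob"), patterns, demand)
roster_bad, uncovered_bad = decode(("bob", "ann"), patterns, demand)
```

Scheduling the less flexible nurse first yields full cover, whereas the reverse ordering leaves the third day uncovered, so the quality of the decoded roster depends on the permutation that the genetic algorithm evolves.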

Cai and Li [66] used a multi-criteria genetic algorithm approach to schedule staff

with mixed skills. This approach defines three objectives. The primary objective is

to minimise the staffing costs required to meet the skill mix and cover requirements.

The other two objectives aim to balance the staffing by aiming to maximise the

staff surplus over all the best solutions according to the primary objectives, and to

reduce the variation of the staff surplus over time. The selection of members of the

population for crossover and mutation is performed using a lexicographical ordering

of the solutions according to the three objectives. This method does not consider staff

preference or requests. It enables the modelling of different skill levels and includes

provision for skill substitution.
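A lexicographical ordering over three objectives of this kind can be expressed compactly with tuple comparison. The candidate solutions and their objective values below are invented for illustration, and the sketch shows only the ranking step, not the selection scheme of [66].

```python
# Each candidate is (label, staffing_cost, total_surplus, surplus_variation).
# The numbers are invented for illustration.
candidates = [
    ("A", 100, 4, 2.0),
    ("B", 95, 1, 0.5),
    ("C", 95, 3, 1.5),
    ("D", 95, 3, 0.8),
]


def lexicographic_key(candidate):
    _, cost, surplus, variation = candidate
    # minimise cost, then maximise surplus, then minimise variation
    return (cost, -surplus, variation)


ranked = sorted(candidates, key=lexicographic_key)
order = [label for label, *_ in ranked]
```

Solution A has the largest surplus but ranks last because the secondary objectives are consulted only among solutions that tie on the primary one.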

Jan et al. [121] developed a ‘population-less cooperative genetic algorithm’ ap-

proach to solve a multi-objective nurse rostering problem. In this method the nurses

are modelled as agents whose shift patterns are optimised according to temporal con-

straints and shift preferences. A reasonable number of soft constraints are defined

which measure the quality of nurses’ shift patterns. Simultaneously, the quality of

the whole schedule is optimised according to the average and standard deviation of

the quality of all the nurses’ shift patterns. The genetic algorithm operates on feasible

solutions - defined by the roster’s satisfaction of cover and skill mix requirements. The

algorithm is tested on simulation data.

Burke et al. [55] present a memetic algorithm approach to the problem described

in [52, 53]. A number of variations are described of the basic memetic algorithm,

which is a combination of a genetic algorithm and a steepest descent improvement

heuristic. The memetic algorithm is then combined with the tabu search approach

used previously in a hybrid algorithm. The authors report that the memetic algorithm

alone produced solutions which rival those of the tabu-search approach but took far

longer to do so. The hybrid memetic/tabu approach produces solutions significantly

better than the tabu-search and it is recommended that this method be used when

schedulers do not need fast results.

Kawanaka et al. [125] developed a genetic algorithm approach for rostering nurses

of different skill classes. This approach defines two types of constraints, ‘absolute’ and

‘desirable’, which roughly translate to hard and soft. The absolute constraints include

minimum coverage and skill mix and temporal constraints regarding night shifts, free

days, and weekends. Desirable constraints are weighted and used to measure the

fitness of solutions. Staff requests, preferences, and some pattern definitions are used

to define these soft constraints. The crossover operations they use in the genetic

algorithm are allowed to generate infeasible offspring and a number of tools are used

to re-impose feasibility. The algorithm is compared to a GA approach in which

absolute constraints are included in the fitness measure. The authors argue that

inclusion in this way hinders the search process due to the steep peaks and troughs

it creates in the fitness landscape. This hypothesis is supported by the experimental

results, although the results presented appear to pertain to only a single dataset.

Another genetic algorithm approach to nurse rostering is presented by Duenas

et al. [85]. This approach uses a multi-objective nurse rostering model incorporat-

ing individual nurse preferences and an emphasis on the role of the decision maker.

Ingolfsson et al. [118] describe a workforce scheduling algorithm which takes into

account the queueing effects that can be found in units with unpredictable demands.

A fuzzy genetic algorithm is presented by Li and Kwan [146] for scheduling drivers

in the public transport industry. This bi-objective approach formulates goals using

fuzzy sets to model the weighting of individual shifts which will then be chosen to

form the final schedule.

Simulated Annealing

Simulated annealing is based upon an analogy with the physical process of annealing.

In this process a solid material is initially heated and then cooled in order to improve

certain qualities. For optimisation problems this principle is used in order to escape

local optima in the solution space. A cooling schedule is represented by a formula

which describes the probability that inferior quality solutions can be accepted in the

search process. Hence at the beginning of the process the search has a higher ability

to escape local optima than later in the search when it is encouraged to find the

optimal solution in the current local region.
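
The acceptance behaviour described above can be sketched as follows. This is only an illustration: it assumes the commonly used Metropolis criterion and a geometric cooling schedule, neither of which is prescribed by the papers reviewed here, and the function names and default parameter values are hypothetical.

```python
import math
import random


def acceptance_probability(delta: float, temperature: float) -> float:
    """Probability of accepting a move that changes the cost by `delta`.

    Improving or equal moves (delta <= 0) are always accepted. Worsening
    moves are accepted with probability exp(-delta / temperature), which is
    the Metropolis criterion (an assumption made for this sketch).
    """
    if delta <= 0:
        return 1.0
    return math.exp(-delta / temperature)


def geometric_cooling(t0: float, alpha: float, step: int) -> float:
    """Temperature after `step` iterations of a geometric schedule (assumed)."""
    return t0 * (alpha ** step)


def simulated_annealing(cost, neighbour, start,
                        t0=10.0, alpha=0.95, steps=500, seed=0):
    """Minimise `cost`, proposing moves with `neighbour(solution, rng)`."""
    rng = random.Random(seed)
    current, best = start, start
    for step in range(steps):
        temperature = geometric_cooling(t0, alpha, step)
        candidate = neighbour(current, rng)
        delta = cost(candidate) - cost(current)
        # Early on (high temperature) inferior moves are often accepted;
        # later the search behaves almost like pure descent.
        if rng.random() < acceptance_probability(delta, temperature):
            current = candidate
        if cost(current) < cost(best):
            best = current
    return best
```

With these assumptions a worsening move of fixed size becomes less likely to be accepted as the temperature falls, which mirrors the shift from escaping local optima to refining the current region.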

Brusco and Jacobs [51] present a simulated annealing algorithm for the tour

scheduling problem. In this approach long term working patterns (tours) are selected

from a list of feasible patterns. These tours consist of full-time tours and part-time

tours. The method is used to produce schedules for an organisation which operates

continuously with demand levels that fluctuate on an hourly basis. The algorithm attempts

to minimise the staffing costs while still meeting the demand requirements and

does not include any notions of staff preference or satisfaction. Different maximum

ratios of full time to part time staff are considered.

Bailey et al. [24] developed a simulated annealing approach alongside the genetic

algorithm approach previously described. They also compared the results achieved

with an integer programming formulation and a steepest descent algorithm. They

found that the simulated annealing algorithm produced solutions of comparable qual-

ity to the genetic algorithm but in much less time. They identified that the steepest

descent algorithm was the most scalable, followed by the simulated annealing and

genetic algorithms, and that the IP formulation was the least scalable.

Other Approaches

Schaerf and Meisels [202] describe a ‘generalised local search’ approach to employee

timetabling problems. This approach considers many complex constraints and allows

employees to have one or more skills associated with them in an overlapping fashion.

The method allows individual employee requests. The research focuses on the ability

of local search algorithms to successfully navigate the search space. Neighbourhoods

are defined using three different operators: Replace, Insert, and Delete. These operators

manipulate the assignment of particular tasks to employees. The neighbourhoods

are defined so that partial solutions can be accepted; these are solutions in which not all

of the requirements (with respect to cover of tasks) are satisfied. Violations of the

requirements and of the soft constraints contribute towards the cost function. Meisels

and Kaplansky [160] experiment with iterative restart techniques. These methods de-

tect stagnation in the search process and restart from a new random solution.
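
The restart idea can be sketched as below. This is not the algorithm of Meisels and Kaplansky; it is a minimal illustration which assumes that stagnation means a fixed number of consecutive non-improving iterations (the hypothetical `patience` parameter) and that the underlying search is a simple steepest descent.

```python
import random


def local_search_with_restarts(cost, random_solution, neighbours,
                               max_iterations=1000, patience=20, seed=0):
    """Steepest-descent local search that restarts when the search stagnates.

    cost            -- function(solution) -> number, lower is better
    random_solution -- function(rng) -> a new random starting solution
    neighbours      -- function(solution) -> list of neighbouring solutions
    patience        -- non-improving iterations tolerated before a restart
                       (an assumed stagnation test)
    """
    rng = random.Random(seed)
    current = random_solution(rng)
    best = current
    stale = 0
    for _ in range(max_iterations):
        candidate = min(neighbours(current), key=cost)
        if cost(candidate) < cost(current):
            current = candidate
            stale = 0
        else:
            stale += 1
        if cost(current) < cost(best):
            best = current
        if stale >= patience:
            # Stagnation detected: abandon this region and start again.
            current = random_solution(rng)
            stale = 0
    return best
```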

Aickelin and White [12] used statistical methods to compare rostering algorithms.

The comparison methods are then used to systematically evaluate alterations to algo-

rithms. In particular the problem of comparing algorithms which sometimes return

‘infeasible’ results is addressed. The result is a tool which can be used to identify

successful modifications to existing algorithms and thus improve them. The authors

report that the final, improved algorithm that they developed outperforms the genetic

algorithm developed previously [9, 10, 11].

Bellanti et al. [38] developed a greedy-based neighbourhood search to solve a

nurse rostering problem from a ward in an Italian hospital. In this problem the

constraints are very similar to those that will be described in this thesis although

the complexity of the problem is reduced by considering all nurses as being equally

skilled. The approach is multi-objective and a lexicographical approach is used to

order the following goals:

- minimise the deviation in the number of days each nurse has off per month from

the pre-defined ward average.

- minimise the shortages from required cover for both morning and evening shifts for day

shifts.

- minimise the shortages from required cover for either shift for day shifts.

- minimise the shortages from required cover for night shifts.

- minimise the weighted sum of the soft constraint penalties.

The approach uses an ordering heuristic to generate initial solutions and then a

number of different strategies are used in order to escape local optima in the search

space. Partial solutions can be accepted during the search but are then repaired using

a greedy procedure.
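
A lexicographical ordering of goals, as used in several of the approaches reviewed in this chapter, can be illustrated with tuple comparison. The sketch below is an assumption-laden simplification: the three field names are placeholders and do not reproduce the exact goals of Bellanti et al. or of Cai and Li.

```python
from typing import NamedTuple


class RosterScore(NamedTuple):
    """Objective values for one roster, listed from most to least important.

    The field names are illustrative placeholders. Lower is better for
    every field.
    """
    days_off_deviation: float
    cover_shortage: float
    soft_penalty: float


def best_roster(scores: dict) -> str:
    """Return the name of the roster with the lexicographically smallest score.

    Tuples compare element by element, so a later goal is only consulted
    when all earlier goals are tied.
    """
    return min(scores, key=scores.get)
```

Under this ordering no improvement in a lower-priority goal can compensate for a deterioration in a higher-priority one, which is the main difference from a weighted sum.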

De Causmaecker and Vanden Berghe [81] investigate the problem of relaxing coverage

constraints when feasible solutions cannot be found for real-world rostering

instances. They found that by reducing requirements for individual wards better

solutions could be found, and the problem of users over-constraining problems can

be mitigated. Franses and Post [94] describe an automated personnel scheduling al-

gorithm for laboratories. This algorithm uses mathematical techniques with greedy

local search and has been implemented in the commercial product IPS.

2.4 Conclusion

This chapter has described the development of automated personnel rostering meth-

ods over the last 50 years. The characteristics which are fundamental to most ros-

tering problems were identified. In particular, attention was paid to the wide variety

of different formulations which can arise from real-world rostering problems. Many

papers have been written suggesting methods for generating staff schedules and some

of the key results have been presented in this chapter.

The early methods were developed using mathematical techniques such as integer

programming and usually focussed on satisfying a limited range of constraints and

objectives. These methods lacked flexibility and were unable to solve problems of a

realistic size, and consequently have not been used extensively in real-world settings.

Nevertheless, they introduced some of the key concepts which were combined and

improved in later methods. For example, the notions of staff preference and shift

requests introduced by Warner [240] have been used in the most successful modern

methods.

Constraint programming and meta-heuristic approaches have been used to suc-

cessfully solve real world problems. In particular, three nurse rostering systems stand

out as examples of both theoretically successful methodologies and viable commercial

products:

- ANROM [52, 53, 54, 55] is capable of modelling an exhaustive range of constraint

types, and even allows users to configure new constraints. The system utilizes a

number of different technologies centred around a hybrid tabu-search method.

- The hybrid tabu search and branch and bound method developed by Dowsland et

al. [84, 83] successfully solves real-world problems by searching through both

feasible and infeasible solutions. Through the use of pattern-assignments it

allows a large number of constraints to be specified.

- ORBIS Dienstplan [162, 164] uses a constraint reasoning approach which allows

many different constraints to be specified by the user in a very natural fashion.

A hybridisation of branch and bound and local search approaches is used to

find good quality solutions.

These methods are successful largely due to their ability to handle many different

user-configurable constraints. They also use mixed methodologies to solve problems

- taking advantage of the good characteristics of the available technologies to solve

key tasks. The method described in this thesis attempts to incorporate both of these

features by being both highly configurable and by combining different approaches (i.e.

CBR, tabu search, genetic/memetic algorithms).

Many other methods in the literature provide duplicated approaches and con-

tribute little to the advancement of the field beyond a validation of existing principles.

Such methods include the tabu search by Berrada et al. [43], the genetic algorithms

by Bailey et al. [24] and Cai and Li [66] and many other methods not cited in this

thesis. Often these methods fail to include concepts of staff satisfaction (particu-

larly with regards to preferences and requests) and other key constraint types. The

chief criticism, however, is that the authors do little to generalise their methodologies

beyond the particular problem instance they were attempting to solve.

Eliciting problem solving knowledge from personnel rostering experts is an issue

which is thus far unsolved in the literature. The successful methods listed above

rely on weighting constraints or patterns, and prioritisation of constraint types for

capturing some of the knowledge about the qualities users would like to see in their

rosters, without providing a means to acquire information about how this should be

achieved. This thesis attempts to address this problem using a method which allows

users to specify how they would like problems to be solved.

Chapter 3

Case-Based Reasoning

3.1 Introduction

Case-Based Reasoning (CBR) [133] is an artificial intelligence (AI) methodology

which aims to solve new problems by using information about the solutions to previous

similar problems. It operates under the premise that similar problems require similar

solutions. A history of previous problem solving episodes, or cases, is stored in a

database (called a case-base). When a new experience is judged to contain informa-

tion which may aid future problem solving it is stored as a new case in the case-base,

thus increasing the knowledge it contains. This basic model provides a framework

for a vast array of different CBR approaches, some of which will be discussed in this

chapter.

Reasoning in artificial intelligence methodologies has traditionally been treated as

a process of sequentially applying rules in order to draw conclusions [141]. Rule-based

systems, for example, must establish a priori a set of relations which explicitly define

the behaviour of model elements. The development of CBR grew out of a realisation

that these systems lacked robustness and flexibility, were confined to narrow problem

domains, and were difficult to maintain and adapt over time [1]. The acquisition of

knowledge in the form of rules, particularly in complex real-world domains, can be

highly time consuming and often inaccurate [222].

CBR can address many of these issues because it does not call for an explicit

model of the problem domain [243]. CBR systems make use of specific information

about previous reasoning and so knowledge acquisition is reduced to the process

of recording previous experiences. Domain experts can interact directly with such

systems because they can provide domain knowledge by example [133]. This process

can be on-going and therefore provides both flexibility and adaptability. CBR systems

learn by memorising new knowledge as cases [243] and therefore improve over time by

filling any ‘gaps’ in their knowledge or by replacing experiences which are no longer

representative due to changes in domain structure.

The origins of CBR can be found in both cognitive science and artificial intelli-

gence. CBR can be considered as a model for human problem solving [141]. It is

believed that humans solve new problems by remembering how they solved previous

similar problems in the past [242]. Kolodner gives an instructive example of this

kind of human reasoning in [133]. She describes a host planning a meal for several

people including vegetarians, allergy sufferers, and those with particularly restrictive

tastes. In order to solve the problem the host thinks about meals that she has served

to her guests in the past. As part of this remembering process the host recalls who

she served the particular meals to and what problems arose (for example regarding

allergies). She also spends time thinking about how the meals served in the past

could be adapted to the makeup of her new group of guests. She finally chooses to

make the meal which she previously served to a group that is most similar to the new

group and to which adaptations can be made to resolve any outstanding issues.

Schank and Abelson conducted some of the initial research into the case-based rea-

soning model [204]. They defined the concept of a script as a structure used within

human conceptual memories to store information about stereotypical situations. On

the basis of these scripts human beings can reason and draw conclusions about situa-

tions in which they find themselves. This idea was extended by Schank in a dynamic

memory model [203] which emphasised the importance of indexing when using past

experience for understanding. Schank postulated that the processes of remembering,

understanding, experiencing, and learning are mutually inclusive. Human memory

is dynamic not only because we are constantly adding new experiences but also be-

cause these new experiences change the way we think about our older memories. As

a result, no two acts of ‘remembering’ are guaranteed to produce the same results.

It was generally recognised that similarity and metaphor [99], concepts and concep-

tual structure, and analogy are key to human reasoning. In cognitive science the

concepts which embody case-based reasoning generally come under a research area

called analogical reasoning [237].

An AI view of the case-based reasoning model was given by Porter and Bareiss

[191]. The Category and Exemplar Model works on the basis that natural concepts

in the real world should be defined ‘extensionally’ - their characteristics should be

defined explicitly and enumerated to represent different instances. The case-memory

is defined as a network of categories, cases, and index pointers. Each case is associ-

ated with a category and features are weighted according to their importance. Three

types of indices are defined: pointers from features to cases or categories, pointers

from categories to all their associated cases, and difference links which describe how

a case differs from its neighbours in terms of the features. This network is traversed

when solving problems and cases are retrieved for reasoning through knowledge-based

pattern matching which is highly dependent on the problem domain.

The early CBR literature described a framework for solving problems which could

be implemented using a variety of computational and information technological tools.

Watson [242] describes CBR as a methodology rather than a technology because it

gives a contractual description of what must be done without describing exactly how

it can be achieved. Technologies are chosen to implement the CBR methodology

according to their suitability to the problem being addressed. Examples include

nearest neighbour similarity measures, neural networks, fuzzy logic, rule-based rea-

soning and/or rule induction, graph theory, database languages such as SQL, and even

optimisation-type tools such as mathematical programming and meta-heuristics.

In this chapter some of the key issues and methodological/technological approaches

to CBR will be described, with particular attention to those which are relevant to the

work described in this thesis. In Section 3.2 the basic CBR model will be presented.

Section 3.3 will provide reviews of the key research that has been carried out to date

into the scheduling applications of CBR.

3.2 Methodology and Research Issues

There have been a number of models suggested for case-based reasoning and in this

section the most important of these will be described. This subject field is vast and

the author will attempt to limit the descriptions to those models and details rele-

vant to the application of case-based reasoning in scheduling and planning problems.

Comprehensive descriptions of case-based reasoning and its history can be found in

the key text by Kolodner [133].

3.2.1 The Case-Based Reasoning Cycle

A single reasoning episode can be decomposed into a set of distinct phases. Kolodner

[133] distinguishes between two styles of CBR. In problem-solving CBR solutions are

proposed by retrieving previous experience and adapting it to solve new problems.

Problem-solving CBR is used to solve problems where solutions cannot be determined

in advance (e.g. nurse rostering). Interpretive CBR uses retrieved experiences to

justify conclusions about situations and requires the creation of a set of steps or

arguments needed to generate the desired solution. This style of CBR is used for

process planning in domains such as law where a desired outcome (e.g. to win the case)

is known and the steps needed to get to this state must be determined. In either style

the retrieval/adaptation/justification process is followed by processes of criticism and

evaluation. If outcomes are not of sufficiently high quality or relevance then phases

can be repeated. This model of a single reasoning episode is presented graphically in

Figure 3.1.

Many authors have described case-based reasoning in terms of a cyclic learning

model. One of the most cited is that of Aamodt and Plaza [2] and a pictorial repre-

sentation of the model is given in Figure 3.2.

The various tasks performed by CBR are often described as the ‘4 REs’: REtrieve,

REuse, REvise, and RETain. This ‘R4’ [242] model can be summarised as follows:

- Retrieve: Find the most similar case(s) to the current problem in the case-base.

- Reuse: The solutions in the retrieved cases are then used to solve the new problem.

- Revise: The suggested solutions are adapted to the context of the current problem.

- Retain: Any useful information or experience that can be gained from the current

problem solving instance is retained for future reasoning.

Figure 3.1: A reasoning episode according to Kolodner (1993)
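
The four steps of the R4 model can be sketched as a single function. This is a schematic outline rather than any published system: the `similarity`, `adapt`, and `is_useful` arguments are hypothetical hooks that a concrete CBR system would have to supply, and the case-base is assumed to be a plain list of (problem, solution) pairs.

```python
def cbr_cycle(problem, case_base, similarity, adapt, is_useful):
    """Run one retrieve/reuse/revise/retain episode.

    case_base  -- list of (problem, solution) pairs
    similarity -- function(problem_a, problem_b) -> float, larger is more similar
    adapt      -- function(old_solution, old_problem, new_problem) -> solution
    is_useful  -- function(problem, solution) -> bool, decides whether to retain
    All four arguments are hypothetical hooks supplied by the caller.
    """
    # Retrieve: the stored case whose problem is most similar to the new one.
    old_problem, old_solution = max(
        case_base, key=lambda case: similarity(case[0], problem))
    # Reuse: take the retrieved solution as the starting suggestion.
    suggestion = old_solution
    # Revise: adapt the suggestion to the context of the new problem.
    solution = adapt(suggestion, old_problem, problem)
    # Retain: store the new experience if it is judged useful.
    if is_useful(problem, solution):
        case_base.append((problem, solution))
    return solution
```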

More recently an ‘R5’ model has been suggested with the insertion of a new first

step: Repartition [92]. In this approach the case-base is analysed off-line in order to

reorganise the case-base by ensuring that the relationships between problems and so-

lutions are partitioned correctly according to the case-matching technique employed.

Many authors recognise the need for off-line reorganisation of the case-base and a

large number of papers are dedicated to the subject of case-base maintenance.

The phases identified in the R4/5 model have been given different names by au-

thors throughout the literature. Allen describes the R4 steps as ‘retrieval’, ‘adap-

tation’, ‘validation’, and ‘update’ and adds an initial step called presentation [15].

In this step the current problem is identified and organised in such a way that it

becomes compatible with the contents and retrieval methods of the case-base. Voss

[238] describes the phase of problem decomposition as an optional initial step for

complex problems. The decomposition of problems into smaller sub-problems allows

each to be solved separately using CBR (or other approaches). The use of an initial

decomposition step may make it necessary to include a composition step as a final

phase which combines all the solutions to the sub-problems into a single full solution.

Figure 3.2: The CBR cycle by Aamodt and Plaza (1994)

Voss [238] categorises different CBR strategies as follows:

- single case adaptation: a similar case is retrieved from the case-base and adapted

to produce a solution.

- a priori decomposition with incremental adaptation: the problem is decomposed

into sub-problems which are solved sequentially and added to the final evolving

solution.

- a priori decomposition with simultaneous adaptation: the problem is decomposed

into sub-problems which are solved simultaneously. The resulting solutions are

composed into a final solution.

- incremental decomposition and adaptation: the problem is solved through a series

of decomposition-adaptation steps resulting in a solution which evolves over time.

- multi-case retrieval and simultaneous adaptation: a number of similar cases are

retrieved from the case-base and a solution is built using a single adaptive step.

Voss surveyed the strategies used by a number of well known CBR systems to

produce this categorisation. She concludes that for many complex problem domains

a CBR approach may need to hybridise the five basic strategies described.

Often the phases of reuse and revision of the R4/5 models are combined into one

of adaptation [15, 243], as they are in this thesis. There are a number of justifications

for this. The processes of finding the relevant cases in the case-base and then using

the knowledge contained in these can easily be separated (although in some methods

this is not the case). However, the phases of reusing solutions and adapting them are

intrinsically linked and it is difficult to identify their borders and interfaces. For the

remainder of this thesis they shall be treated as a single ‘adaptation’ phase.

3.2.2 Case Structure

Key to the success of any CBR system is the choice of information to store in cases.

Information must be relevant to the type of problem solving it will be used for. It

is argued by some researchers that the burden of knowledge acquisition which rule-

based methods suffer from is mirrored by the need for careful consideration of case

structure in CBR [76]. However, it is evident that in case-based reasoning only the

features of the problem need to be identified and not necessarily the relationships

between them [2].

Before describing how cases can be structured it is useful to define what a case

is in the abstract sense. A case is a unit encapsulating knowledge relevant to a

particular experience [243]. Typically cases contain a description or representation

of the problem, the solution that was used to address the problem, the outcome (i.e.

success, failure or a description of the outcome state of an experience), and sometimes

the context in which the case was generated (or the context in which it can be used

in the future). The solution to the problem can be represented in a number of ways.

In some systems solutions represent actual instantiations of the problem formulation

- and describe a state to which the objects in the problem must be moved [25]. In

others, the solutions represent methods for solving the problem [237]. CBR systems

have also been designed to predict outcomes to particular scenarios.
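
As an illustration of the case contents listed above, a case could be represented by a small record type. The sketch below is an assumption made for exposition: the class name, field names, and the example values are hypothetical and do not describe the case structure used later in this thesis.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Case:
    """One problem-solving episode stored in a case-base (illustrative)."""
    problem: Dict[str, Any]          # description of the problem as attribute-value pairs
    solution: Any                    # a solution state or a method for reaching one
    outcome: Optional[str] = None    # e.g. "success" or "failure", if recorded
    context: Dict[str, Any] = field(default_factory=dict)  # where the case applies


# Hypothetical example: a rostering violation and the repair that resolved it.
example = Case(
    problem={"violation": "too_many_consecutive_nights", "nurse_grade": 2},
    solution="swap_shift_with_colleague",
    outcome="success",
)
```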

In their simplest form cases are represented as vectors of attribute-value pairs. The

data types that can be stored in these can be numeric, symbolic, Boolean, or even

object-based. More complex domains, however, need more structured representations.

These can be used to encode relationships between the attributes or facilitate faster

and more accurate searching of the case-base. Cases may be classified according to

the following characteristics [2]:

- concrete/abstract : cases may represent concrete experiences or more generalised

episodes.

- complete/partial : cases may correspond to separate knowledge units or be dis-

tributed between sub-units. In some systems partial knowledge is stored when

knowledge elicitation is inconsistent or suffers from stochastic interference.

- isolated/related : cases may be defined totally independently of one another, or

they can be linked to each other in the form of a case hierarchy or relationship

network.

The choice of actual structural representation for cases depends heavily on the

problem domain. Cases should contain enough detail to allow future problems to be

solved. They must store sensible characteristics of the problems so that the system’s

performance is not degraded by the presence of erroneous, incomplete, or irrelevant

information.

The objects used to describe problems are commonly referred to as indices and

the process of assigning indices to cases is indexing [243]. There is much debate as

to the qualities which define good indices but in general they must (according to

[44, 105, 133]) be:

- predictive: they must describe the aspects of the problem which were taken into

account in producing the solution. This can include information about the final

outcomes of the solution (e.g. degrees of success/failure or state descriptions).

- abstract : information stored about problems should be as generally applicable to

possible future problems as possible. This increases the applicability of the

knowledge. It may require that additional salience indices are stored alongside

the descriptive indices to highlight the important aspects of each case.

- concrete: indices must not be too abstract or they will lose so much descriptive

content that future cases cannot be recognised as being similar without extensive

inference.

- useful : they must be relevant to the problem being solved.

Designers of CBR systems can use a variety of manual and automatic case indexing

algorithms. In general, manual methods focus on identifying the aims of the system

and the purpose of each case. Apart from vector feature selection algorithms, which

will be explored in Chapter 6, automated indexing methods are beyond the scope of

this thesis. Readers are directed to [133] for details of some frequently used methods.

3.2.3 Retrieval

The retrieval phase starts with a description of the current problem (sometimes re-

ferred to as the focus case) and returns either the most similar case in the case-base

[2], or a set of the most similar cases [133]. Crucial to this phase is the definition

of the level of similarity between cases. Many approaches have been developed and

depend heavily on the nature of the data stored in the cases.

Case-bases containing problems which can be represented with feature vectors use

a nearest neighbour (NN) similarity measure to determine the distance between cases.

Cases can be represented as points in a feature space where the Euclidean distance

between cases is inversely proportional to the similarity between them. One of the

most common nearest neighbour distance functions used for NN methods is:

$$\frac{\sum_{i=1}^{n} w_i \, \mathrm{sim}(f_i^I, f_i^R)}{\sum_{i=1}^{n} w_i}$$

for cases represented by $n$ features, where $w_i$ is the weight of feature $i$, $\mathrm{sim}$ is the

defined distance function for single features, and $f_i^I$ and $f_i^R$ are the values for feature

$i$ in the input (focus) case and retrieved case respectively. The distance functions

used for each feature depend on the feature type. NN approaches will be discussed

in more detail in Chapter 6.
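
The weighted measure above can be computed directly. The following sketch assumes that each per-feature function returns a similarity in the range 0 to 1; the two example functions (`numeric_sim` and `symbolic_sim`) are illustrative choices and are not the functions defined in Chapter 6.

```python
def numeric_sim(a, b, value_range):
    """Similarity of two numbers, scaled by the width of the feature's range."""
    return 1.0 - abs(a - b) / value_range


def symbolic_sim(a, b):
    """Similarity of two symbols: 1 if they match exactly, otherwise 0."""
    return 1.0 if a == b else 0.0


def weighted_similarity(focus, retrieved, weights, sims):
    """Weighted average of per-feature similarities between two cases.

    focus, retrieved -- feature values of the input case and a stored case
    weights          -- one non-negative weight per feature
    sims             -- one similarity function per feature
    """
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(w * sim(fi, fr)
                   for w, sim, fi, fr in zip(weights, sims, focus, retrieved))
    return weighted / total_weight
```

For example, with weights 1 and 3, a numeric feature that matches to degree 0.5 and a symbolic feature that matches exactly give (1 × 0.5 + 3 × 1) / 4 = 0.875.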

NN methods can be extended by retrieving a set of the nearest cases to the focus

case [247]. The size of this set is fixed to be a certain number k. These so called

k Nearest Neighbour (k-NN) methods often use a voting system to determine which

solution will be passed on to the adaptation phases.
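
A simple majority vote among the k nearest cases might look like the sketch below. The unweighted vote and the tie-breaking behaviour (ties go to the solution that appears first among the nearest cases) are assumptions; published k-NN variants differ on both points.

```python
from collections import Counter


def knn_vote(focus, case_base, similarity, k=3):
    """Return the solution proposed most often by the k most similar cases.

    case_base  -- list of (problem, solution) pairs
    similarity -- function(problem_a, problem_b) -> float, larger is more similar
    """
    ranked = sorted(case_base,
                    key=lambda case: similarity(focus, case[0]),
                    reverse=True)
    votes = Counter(solution for _, solution in ranked[:k])
    return votes.most_common(1)[0][0]
```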

Some CBR systems use induction algorithms for case retrieval. Induction al-

gorithms generate decision trees to organise cases in memory by determining those

features which best allow a classifier to discriminate between them [153]. Famous

examples include the ID family [194] and the AQ family [166] of algorithms. Machine

learning algorithms are used to induce rule-sets from the decision trees which are then

used for future retrieval. Rule-sets provide fast retrieval for CBR but may need to

be adjusted as more cases are added to the case-base. The induction process can be

time-consuming and therefore induction based CBR systems are more suited to less

dynamic domains.
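
The notion of a feature which best discriminates between cases can be made concrete with information gain, the criterion used by ID3-style decision tree algorithms. The sketch below only shows how the root feature of such a tree would be chosen; the data layout (a list of feature-dictionary and label pairs) is an assumption made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(cases, feature):
    """Reduction in label entropy obtained by splitting `cases` on `feature`.

    cases -- list of (features, label) pairs, where features is a dict
    """
    labels = [label for _, label in cases]
    remainder = 0.0
    for value in {features[feature] for features, _ in cases}:
        subset = [label for features, label in cases
                  if features[feature] == value]
        remainder += len(subset) / len(cases) * entropy(subset)
    return entropy(labels) - remainder


def most_discriminating_feature(cases, feature_names):
    """Feature with the highest information gain, i.e. the root of the tree."""
    return max(feature_names, key=lambda name: information_gain(cases, name))
```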

Template retrieval is used in CBR systems designed for highly descriptive domains

involving linguistic or labelled indices. Cases are selected from the case-base using

SQL-like queries which specify required values for individual parameters [133]. This

approach is often used to trim the initial case-base before the application of a more

computationally intensive approach such as nearest neighbour is used [2, 199, 243].
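
The two-stage use of template retrieval can be sketched as a filter followed by a similarity ranking. The dictionary-based case layout and the function names below are hypothetical; a real system would more likely issue the template as a database query.

```python
def template_retrieve(case_base, template):
    """Keep only cases whose attributes equal every value required by `template`."""
    return [
        case for case in case_base
        if all(case["problem"].get(key) == value
               for key, value in template.items())
    ]


def retrieve(case_base, template, focus, similarity):
    """Trim the case-base with the template, then pick the most similar survivor.

    Returns None when no case matches the template.
    """
    candidates = template_retrieve(case_base, template)
    return max(candidates,
               key=lambda case: similarity(focus, case["problem"]),
               default=None)
```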

Aamodt and Plaza [2] identify two distinct approaches to the definition of similar-

ity for case retrieval: syntactical and semantical. Methods which employ syntactical

similarity retrieve cases based on superficial, ‘knowledge poor’, comparisons of the

feature values. The advantage of these methods is that they can be used in domains

where explicit knowledge is difficult to elicit. Semantic similarity assessment makes

use of deep and complex domain knowledge to compare feature values - including

their relative weights. In general, methods which use semantic similarity matching

produce explanations of why two cases match.

3.2.4 Adaptation

Many authors believe that case reuse is the most difficult phase of CBR to describe

generally. Adaptation is problem-specific because it depends on how solutions can be

reused in the problem domain. However, there are some general patterns that can be

observed.

The simplest form of case reuse occurs in classification type CBR systems

where the ‘outcome’ of retrieval is usually the assignment of a class label to the

unclassified problem [87]. In these systems the differences between cases within the

same class are deemed unimportant and only the similarities (i.e. the class label) are

considered [2].

In more complex domains the adaptation phase must fit reused solutions into the

context of the new problems. In these systems adaptation focuses not only on the

parts of the solutions that can be reused for the new problem, but also on how the

differences between the past and current problems will affect this reuse [142]. Aamodt

and Plaza [2] identify two main ways that solutions can be reused. Transformational

(or structural [243]) reuse involves using the old solution as a basis for the new

solution. This is facilitated by transformational operators which use information

about the differences between cases to transform old solutions [105]. Other CBR

systems employ derivational reuse, where the method used to generate the old solution

is reused. In these systems cases may also store method parameters and operators,

subgoals considered, and failed search paths to guide the replay of the method for

the new problem [237].

Watson and Marir [243] categorise adaptation techniques in a more detailed way

as follows:

- null adaptation: simple case reuse (classification) [201].

- parameter adjustment: transformational reuse where the differences between problem

parameters are used to modify the solution [25].

- critic-based adaptation: rules are generated which identify feature instantiations

which are not compatible because they violate domain constraints [77].

- reinstantiation: features of the old solution are instantiated with features that

must be present in the new solution because they are specified in the problem

description [104].

- derivational replay: derivational reuse as described above [148].

- model-guided repair: causal models of the problem domain are used to guide the

adaptation phase by repairing elements in the old solution which are infeasible

(or sub-optimal) in the new solution [225].

- case-based substitution: other cases are used to suggest appropriate adaptations

[106, 109, 175].
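
To make the first two categories concrete, the following minimal sketch contrasts null adaptation with parameter adjustment. The staffing example, the `StaffingCase` class, and the proportional scaling rule are assumptions chosen purely for illustration; they are not drawn from any of the systems cited above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaffingCase:
    patients: int  # problem parameter stored in the case (hypothetical)
    nurses: int    # solution stored in the case (hypothetical)


def null_adaptation(case: StaffingCase) -> int:
    """Reuse the stored solution unchanged."""
    return case.nurses


def parameter_adjustment(case: StaffingCase, new_patients: int) -> int:
    """Scale the stored solution by the ratio of new to old problem parameter.

    The result is rounded up using integer arithmetic so that the adapted
    solution never provides less cover than the proportional amount.
    """
    if case.patients <= 0:
        raise ValueError("stored case must have a positive patient count")
    return (case.nurses * new_patients + case.patients - 1) // case.patients


old = StaffingCase(patients=20, nurses=4)
print(null_adaptation(old))           # stored solution reused as-is
print(parameter_adjustment(old, 30))  # solution scaled for a larger problem
```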

3.2.5 Maintenance

CBR systems must be maintained intelligently, systematically, and preferably auto-

matically in order to be successful in real-world applications [130]. The success of

CBR can be measured in a variety of different ways, depending both on the tech-

nologies used to implement the various phases, and on the domain it is applied in.

Most often maintenance in CBR refers to maintenance of the case-base in terms of

the cases it contains. It can, however, also refer to the revision of the case indices or

of the retrieval and adaptation phases.

Smyth [216] identifies two separate CBR maintenance issues. Efficiency directed

maintenance aims to minimise the computational costs associated with the storage

of cases in the case-base and with the retrieval and adaptation phases. Competence

directed maintenance focuses on improving the quality of cases in the case-base and

of the reasoning outcomes.

Efforts at efficiency directed maintenance most commonly involve the identifica-

tion and subsequent removal of redundant cases from the case-base. Much research

has been carried out in the broader knowledge-based reasoning community into the

concept of a saturation point in knowledge acquisition [93, 217]. This point is reached

when the addition of more knowledge into the knowledge-base (case-base) is deemed

to have an overall negative effect on the trade-off between the effectiveness of the

system and the processing cost. A number of methods which limit case-base size

according to a measurement of this tradeoff have been developed [126, 154, 170].

Measuring the competence of a case-base is more complex and a number of dif-

ferent metrics have been investigated [143]. The number, density, and distribution

of cases in the case-base all have an impact on quality. It has been observed that

measurements of size (i.e. the number of cases) do not provide accurate performance

guidance because they do not take into account the difference in effect of different

cases [217]. The density and distribution of cases in the case-base can be determined

using the similarity measure used during retrieval [218]. Cases which are ‘closer to-

gether’ in the problem space induced by the similarity measure more often produce

the same (or similar) outcomes. Some of the cases in these dense regions may be

redundant and in general cases in such regions have less influence on the reasoning

process than those in sparse regions [184]. The distribution of outcomes amongst

cases in the case-base can also be measured. Case-bases which contain dense regions

representing a large number of different solution outcomes are more likely to produce

inconsistent or erroneous results.

Smyth and Keane [218] define coverage as a measure of case-base competence.

This measure determines the performance of individual cases with respect to the

CBR system’s retrieval and adaptation methods. It is defined, for each case, as the

set of problems contained in other cases which the case can solve with the knowledge

it contains. Smyth and Keane use this measure in a case-base maintenance method

which targets the removal of cases which share coverage sets with other cases whilst

maintaining the overall set of problems that the whole case-base can solve.
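
The coverage measure and the deletion idea described above can be sketched as follows. This is a simplified greedy procedure written for illustration, not the policy of [218]: the `solves` predicate is a hypothetical stand-in for the system's retrieval and adaptation methods, and each case is assumed to solve its own problem.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Set

Solves = Callable[[Hashable, Hashable], bool]


def coverage_sets(cases: Sequence[Hashable],
                  solves: Solves) -> Dict[Hashable, Set[Hashable]]:
    """For each case, the problems held in OTHER cases that it can solve."""
    return {c: {t for t in cases if t != c and solves(c, t)} for c in cases}


def competence(case_base: Sequence[Hashable], problems: Sequence[Hashable],
               solves: Solves) -> Set[Hashable]:
    """Problems solvable by at least one case (a case solves its own problem)."""
    return {t for t in problems
            if any(c == t or solves(c, t) for c in case_base)}


def prune_redundant(cases: Sequence[Hashable], solves: Solves) -> List[Hashable]:
    """Greedily drop cases whose removal leaves overall competence unchanged."""
    kept = list(cases)
    target = competence(kept, cases, solves)
    for candidate in list(cases):
        remaining = [c for c in kept if c != candidate]
        if competence(remaining, cases, solves) == target:
            kept = remaining
    return kept


# Hypothetical example: cases are points on a line, and a case can solve
# any problem lying within distance 1 of it.
def near(c: int, t: int) -> bool:
    return abs(c - t) <= 1


cases = [1, 2, 3, 10]
print(coverage_sets(cases, near))
print(prune_redundant(cases, near))
```

The result of a greedy pass of this kind depends on the order in which cases are considered, which is one reason more careful deletion policies have been studied.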

Other research has been carried out into the transferal of knowledge contained

in cases to the adaptation phase. Shiu et al. [209] describe a fuzzy decision tree

approach which converts some of the cases in the case-base into fuzzy rules which are

used during adaptation on the new, smaller case-base. They found that this approach

could considerably reduce the size of a case-base with only minor losses in accuracy.

3.3 Case-Based Reasoning for Scheduling and Planning

Recent years have seen an acceleration of interest in developing CBR systems for

a variety of combinatorial optimisation problems including scheduling. The use of

intelligent systems, including CBR, has improved the applicability of many academic

scheduling methods to real world problems, and this trend is set to continue [134, 151].

The re-use of previous problem solving experience is a goal which has the potential to

greatly improve the speed and usefulness of automated scheduling methods, especially

for dynamic problems such as reactive scheduling [224].

CBR approaches to scheduling problems can be classified into three groups with

respect to the type of knowledge contained in the cases [190]:

(a) A case is used to construct a whole or partial solution to a new scheduling

problem. Therefore a case contains the description of the scheduling problem

and the full or partial solution to it. This approach was used in a number

of production scheduling systems [77, 144, 148]. It was also successfully used

in university course timetabling where each problem was represented by an

attribute graph which enables the representation of courses to be scheduled

and the relationships between them [58]. In order to enable the CBR system

to handle timetabling problems of large size, a multiple retrieval system was

developed which constructs a new timetable by iteratively using partial solutions

of the previous timetabling problems.

(b) A case suggests an algorithm for the new scheduling problem that worked well

for a similar problem in the past [136, 137]. For example, such an approach was

used in [205] where a case stored machine scheduling problems, while the cases in

the case base were organised in a transformation graph which showed relations

between cases. A vertex in the graph represented a group of scheduling problems

for which an appropriate algorithm was suggested. Instead of suggesting an

algorithm for the whole problem, in a CBR system for university examination

timetabling a heuristic which worked well in the previous similar situations was

selected at each step during the construction of a timetable [61]. These ‘replay’

type CBR systems have shown great promise for the problems they have been

defined for, although it is not clear that they would scale well to larger sets

of more complex problems. In particular the definition of ‘similarity’ between

large problem instances could potentially be problematic.

(c) A case describes a context in which a scheduling operator is used to repair/adapt

a schedule in order to improve its quality (in terms of constraint satisfaction).

Miyashita et al. [171] used the knowledge capturing capabilities of CBR to

elicit human experts’ preference for particular measures of roster quality when

making repairs to sub-optimal schedules. Operator re-use methods recognise

the importance of expert decision makers in the scheduling process and provide

the potential for effective integration of human and automated problem solving

techniques [150]. The research presented in this thesis is focused on operator

reuse in personnel rostering.

Despite the existence of various CBR approaches to a wide range of scheduling

problems, limited research has been conducted on CBR approaches

to personnel rostering problems. Moreover, a majority of the algorithms developed so

far for personnel rostering problems tend to form the solution from scratch, starting

from an empty roster, rather than understanding a new problem in terms of old

experience. To the best of our knowledge Scott and Simpson were the only researchers

who combined CBR and constraint logic programming to solve simple nurse rostering

problems [208].

In the remainder of this section the use of CBR for scheduling and planning type

problems will be investigated. The scheduling research will be divided according to the

problem type: personnel rostering, educational timetabling, and industrial scheduling.

Much of the work in the literature has focused on the industrial scheduling problems,

particularly in dynamic ‘reactive’ environments. Personnel rostering and educational

timetabling problems have received little attention. In the last part of this section

other related CBR approaches will be discussed including those developed for planning

problems, the Travelling Salesman Problem, and air traffic control problems.

3.3.1 Personnel Rostering

Scott and Simpson [208] developed the only case-based approach to personnel ros-

tering found in the literature. They combined case-based reasoning and constraint

logic programming (CLP) to solve simple nurse rostering problems. A case-base of

commonly used weekly shift patterns is collected from manual rostering solutions.

These shift patterns are then allocated to nurses according to the number and type

of nurses in the problem being solved. They are used as a ‘draft solution’ which is

then improved with local search. The authors define the following steps:

1. Set up a Case-Base of sets of efficient shift patterns.

2. Analyse the current problem or partial problem situation and find the best case

match.

3. Allocate shift patterns from the selected case to nurses using an ordering rule.

4. Analyse shift allocation to find problems of cover.

5. If possible, fix problems of cover by legal swaps of shifts.

6. Store the solution as a generalised case if it is sufficiently different to existing

cases.

In subsequent research Scott et al. [207] investigated combining multiple case

retrieval with objective driven adaptation techniques to maximise result quality. They

have also applied these methods to the travelling salesman problem [206].

3.3.2 Educational Timetabling

Burke et al. [58, 193] developed a CBR approach for solving the problem of assigning

courses to a limited number of time-slots. Their approach was based on the definition

of structural cases. These cases represented previous problems as ‘attribute graphs’

rather than feature-value pairs. The attribute graphs consisted of nodes representing

events (courses) and edges representing relations between the events (constraints).

Each node and edge is given an attribute which corresponds to a descriptive label.

These labels define certain features about the object they are attributed to. For

example, courses can be labelled according to the number of times they must take

place each week. The edges (relations) between these courses can be labelled as being

consecutive or conflicting, according to the students that must take each course. Case

matching is defined as the graph isomorphism problem and solved using Messmer’s

algorithm which organises the graphs into a decision tree. New problems are then

compared to those in the case-base by navigating the decision tree and applying a

similarity measure which compares the attribute values. The retrieved solutions are

adapted using a graph heuristic method which attempts to minimise the violations of

constraints in the new solution. The authors found that their method could success-

fully solve course timetabling problems when the case-base is seeded with good quality

schedules. The method was particularly useful for problems which are solved on a

regular on-going basis because the structural similarities between successive problems

(and therefore solutions) could be successfully exploited.

Burke et al. [61] use CBR in an attempt to increase the generality of existing

university timetabling systems. They developed a tool for selecting heuristics for

solving course and examination timetabling problems. They memorise cases which

contain problem descriptions and the heuristics which worked well to solve them.

Problems are represented by simple features which describe different characteristics

including numbers of courses/exams, rooms, time-periods, students, and events which

cannot clash or be assigned consecutively. Other features are defined which are ratios

of the simple features. Knowledge discovery algorithms are used to select and weight

the features according to the performance of the heuristics on test problems.

3.3.3 Industrial Scheduling

Miyashita et al. [171, 172, 174, 173] developed the case-based scheduling system

CABINS (CAse-Based INteractive Scheduling). CABINS uses the knowledge captur-

ing capabilities of CBR to elicit human experts’ preference for particular measures

of roster quality when making repairs to sub-optimal schedules. A number of differ-

ent measures of solution quality can be established for scheduling problems including

tardiness, machine utilisation, inventory usage, setup times, make-span, and product

quality/yield. Experts prioritise these measures according to their individual goals.

CABINS learns to improve schedules using repairs which imitate the experts’ goals

without needing to establish complex objective functions a priori. The system oper-

ates in one of three different modes:

- Knowledge acquisition interactive mode to acquire user preferences and generate

the case base.

- Decision-support interactive mode where the previously acquired case base that

incorporates user preferences suggests revision actions and evaluation outcomes

to the user who can accept a suggestion or override it with a new suggestion.

- Automatic mode where previously acquired user preferences are re-used to guide

scheduling decisions without any interaction with the user.

CABINS stores cases which contain different objective measurements of the sched-

ule when repairs were made. These measurements are global (pertaining to the entire

schedule) and local (pertaining to individual jobs on individual machines). Each case

stores a repair history which gives a series of repair tactics that were used to improve

the schedule. Each tactic used is assessed to determine the outcome (did it solve the

problem) and the overall effect on the schedule.

The system operated using the following steps:

- A focal job in the initial suboptimal schedule is randomly identified to be re-

paired.

- Activities in a focal-job are repaired in a forward fashion starting with the

earliest activity of that job that has enough upstream slack.

- A repair strategy is selected for the current problem using CBR.

- After a repair has been executed, CBR is used to predict and evaluate the repair

outcome.

- If repair is deemed a success, find the next activity of the focal-job to repair,

else (if repair outcome is a failure), CBR is invoked to select the next repair

tactic to repair the current focal-activity.
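
The repair cycle listed above can be sketched as a simple control loop. This is a schematic reading of the steps, not the CABINS implementation: the three callables are hypothetical placeholders for the case-based tactic selection, the repair execution, and the case-based outcome evaluation, and the check for upstream slack is omitted.

```python
import random
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Log = List[Tuple[str, Optional[str]]]


def repair_schedule(
    jobs: Dict[str, Sequence[str]],
    select_tactic: Callable[[str, List[str]], Optional[str]],
    apply_tactic: Callable[[str, str], None],
    judge_outcome: Callable[[str, str], bool],
    rng: random.Random,
) -> Tuple[str, Log]:
    """Repair the activities of one randomly chosen focal job in order."""
    focal_job = rng.choice(sorted(jobs))       # random focal job
    log: Log = []
    for activity in jobs[focal_job]:           # forward through the job
        tried: List[str] = []
        while True:
            tactic = select_tactic(activity, tried)  # CBR would pick a tactic
            if tactic is None:                 # no tactic left for this activity
                log.append((activity, None))
                break
            apply_tactic(activity, tactic)     # execute the repair
            if judge_outcome(activity, tactic):      # CBR would judge the outcome
                log.append((activity, tactic))
                break
            tried.append(tactic)               # failure: try the next tactic
    return focal_job, log


# Hypothetical stand-ins so that the loop can be exercised.
TACTICS = ["left_shift", "swap"]
GOOD = {("a1", "left_shift"), ("a2", "swap")}
applied: List[Tuple[str, str]] = []

focal, history = repair_schedule(
    jobs={"J1": ["a1", "a2"]},
    select_tactic=lambda act, tried: next((t for t in TACTICS if t not in tried), None),
    apply_tactic=lambda act, tactic: applied.append((act, tactic)),
    judge_outcome=lambda act, tactic: (act, tactic) in GOOD,
    rng=random.Random(0),
)
print(focal, history)
```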

The authors conclude that CBR is an excellent tool for modelling decision making

in complex ill-defined domains such as scheduling. CABINS can successfully elicit

domain knowledge by learning from human experts and apply it iteratively to improve

schedule quality.

MacCarthy and Jou [148] developed a CBR expert system for scheduling problems

with sequence dependent setup times. Simulation is used to generate cases which

consist of schedules which may or may not satisfy all the constraints. Cases are

retrieved based on the existence of conflicts between the constraints and schedule

performance measures. The retrieved cases contain heuristics and algorithms used to

solve problems which are then adapted to the current problem. MacCarthy and Jou

[149] later describe a more general framework for CBR in scheduling. They describe

the elements needed to successfully apply CBR to scheduling problems. In particular,

they emphasise the importance of case design in the development of CBR scheduling

systems.

Szelke and Markus [225] describe a case-based reasoning approach for reactive

scheduling. This method performs repairs to schedules when they are damaged by

external factors. The automated scheduler SUPREACT uses a multi-agent approach

whereby individual agents solve sub-problems in order to work towards common ob-

jectives. The agents perform case-based, rule-based, or model-based reasoning and

take responsibility for key aspects of the re-scheduling. The CBR agents are responsible

for tuning the behaviour of the system according to past performance. Over time

they choose search heuristics to decrease average response time and increase solution

quality. They use statistics about the current schedule, the ordering of jobs, resource

status, machine availability, and operators’ rosters to make decisions.

Cunningham and Smyth [77] introduce a case-based approach to scheduling by

reusing previous good quality solution components. They use their system to solve a

travelling salesman problem and a single-machine scheduling problem with sequence

dependent setup times. Two separate CBR solutions are proposed. The first retrieves

cases from the case-base based on information about the jobs being scheduled and

then uses the solution stored to produce a ‘skeletal’ solution which is then completed

using a separate algorithm. The second approach retrieves cases repeatedly and uses

the retrieved solutions to gradually build the final schedule. Sub-sections of the

problem are identified and compared to the cases in the case-base according to the

jobs. Adaptation methods remove jobs which violate constraints and add them back

to a pool of unscheduled jobs. The authors conclude that the methods can successfully

solve single machine problems but they note that the adaptation techniques are very

problem-specific.

Schmidt [205] used case-based reasoning in a decision support system for produc-

tion scheduling problems. They used problem classification information from schedul-

ing theory to describe problems stored in cases. For each problem a corresponding

best performing solution strategy was also stored in the case. This simple approach

was presented as an interactive tool which could suggest solution techniques to the

decision maker.

Xia and Rao [252] presented a case-based reasoning system for process operation

support. They introduce the concept of dynamic case-based reasoning (DCBR), a

more realistic model for domains involving complex systems and fault propagation

phenomena. Their approach uses time-dependent indices, dynamic feature selection,

and multiple case retrieval strategies. Of particular interest are the time-dependent

features which represent changes in values over time (e.g. changes in temperature).

The system recognised the symptoms of potential faults and advised production managers

of appropriate corrective actions. The authors found that the DCBR model

could accurately predict system faults and significantly improve the performance of

a paper production system.

Louis and McDonnell [147] developed a hybrid CBR and Genetic Algorithm (GA)

approach for optimisation problems and tested it using examples of job shop schedul-

ing, military asset allocation, and circuit-board design problems. Their approach uses

case-based reasoning to store promising solutions which are then periodically injected

into a genetic algorithm’s population. They argue that CBR is good at memory or-

ganisation and the GA is good at solution adaptation. They found that case-injected

genetic algorithms are able to find better solutions than GAs alone, and are able to

converge on good solutions over a smaller number of generations.

3.3.4 Other Problems

Veloso [236] presents the PRODIGY/ANALOGY system which is used to solve process

planning problems. The system integrates automatic case generation, retrieval,

storage, and adaptation with more general planning methods. PRODIGY/ANALOGY

does not attempt to reuse solution components when they are retrieved, but instead

relies on reconstructive methods to interpret how the previous solutions were con-

structed. In later work Stone and Veloso [223] explore the use of user-knowledge

in an interactive version of PRODIGY. This system allows users to interrupt the

planning processes during their execution and change the behaviour of the system.

Bergmann and Wilke [42] describe the PARIS case-based reasoning system for

“Plan Abstraction and Refinement in an Integrated System”. They formalise the ab-

straction of cases from their domain-specific (or concrete) descriptions with a complete

mathematical model. The PARIS system learns abstract planning cases automatically

from sets of concrete training cases.

The PRODIGY/ANALOGY and PARIS systems are discussed in more detail in

Bergmann et al. [40] and Bergmann et al. [41] along with other CBR planning

systems CAPLAN/CBC [177] and ABALONE. The authors show that there is a

relationship between the workload required to solve a problem and the modification

that is required to adapt a retrieved case.

Cunningham et al. [78] developed a CBR system for solving travelling salesman

type problems. They found that CBR could produce solutions of average quality

very quickly but had more difficulty producing good quality solutions no matter how

much time it was given. They determined that solution quality is lost during case

adaptation. They conclude that efforts to use CBR for optimisation problems must

ensure that solution quality is maintained in adapted solutions.

Changchien and Lin [69] describe an interactive CBR system for marketing plan-

ning. This method stores cases using the meta-language XML and retrieves cases

using a combination of XML parsing and the Analytic Hierarchy Process. Adaptation

is performed by identifying core elements of similarity and filling in the gaps iden-

tified by Multi-Attribute Gap Analysis diagrams. They found that the combination

of CBR with multi-attribute decision making techniques facilitated faster marketing

planning.

3.4 Conclusion

This chapter has introduced some of the key research areas in the field of case-based

reasoning. The issues of case structure design, retrieval and adaptation algorithms,

and case-base maintenance have been discussed. CBR is defined as a framework

for solving problems which can be implemented using many different computational

tools.

The use of CBR for personnel rostering problems is virtually unexplored in the

literature. A single approach is cited, that of Scott and Simpson [208], which attempts

to solve a simple nurse rostering problem. Their research focusses on how CBR can

be used as a domain reduction technique to enhance constraint logic programming.

The authors draw positive conclusions about the success of such a hybridisation but

provide little insight into the applicability of CBR to rostering problems of real-world

complexity. It is possible that the method could be enhanced to solve larger problems

but there is no evidence that this has been explored by the authors to date.

In this chapter the application of CBR to the wider field of scheduling and plan-

ning was divided into three distinct classes, based on the type of information stored

within each case in the case-base. In the first class, cases store whole or partial schedules

which are used to construct new solutions to problems or sub-problems deemed to

be similar. Such methods are particularly suited to problems which are composed of

elements which change little between instances. One of the most successful examples

of this approach is given by Burke et al. [58] for university course timetabling prob-

lems. These problems display a high degree of similarity between instances - a large

percentage of the courses offered in one year will be offered again in the next.

Consequently it is practical to use those parts of old solutions which will apply to a

new problem with minimal adaptation. Scott and Simpson’s nurse rostering method

is also an example of the use of partial solutions (i.e. shift patterns) to solve new

problems.

Methods in the second class store cases containing algorithms for solving problems

with common characteristics based on their past good performance. This approach

was used successfully for machine scheduling problems by Schmidt [205] through

the exploitation of certain common structural characteristics. Some good results

have been demonstrated by methods of this class. However, this author remains

unconvinced that meaningful measures of problem similarity can be defined for entire

problems of reasonable complexity.

The final class of methods store examples of repair or construction operators used

to adapt a solution. These methods tend to manipulate individual objects (e.g. jobs,

employees, time slots, shifts, etc.) rather than representations of entire or partial

solutions. The best example of this class is CABINS [171, 172, 174, 173] - indeed

CABINS is undoubtedly the most comprehensively researched example of an application

of CBR to scheduling problems in the literature. CABINS can be used either

interactively, in order to increase the amount of experience in the case-base, or as

an automatic tool. It operates on individual jobs, attempting to improve the overall

quality of the schedule through repair strategies generated from past scheduling

experience. The concepts of individual repair operations and interactive training in-

troduced by CABINS are used in the personnel rostering method described in this

thesis.

Research into the use of CBR for scheduling problems is relatively recent but

has produced some successful and promising methods. It is surprising that CBR is

not used more often for scheduling problems - particularly for solving problems with

a large degree of repetition of certain characteristics between successive instances.

Apart from one simple approach, the use of CBR for personnel rostering is unexplored.

This thesis attempts to address this and apply many of the lessons learnt from the

CBR approaches to machine scheduling and planning problems to the problem of

personnel rostering.

Part II

The CABAROST Model

Chapter 4

A Nurse Rostering Model

4.1 Introduction

The novel problem formulation that will be introduced in this chapter is based on

the author’s analysis of the complex nurse rostering problem in the Ophthalmological

ward at the Queens Medical Centre University Hospital NHS Trust (referred to herein

as the QMC ward). This problem provided the primary motivation for the methods

that will be described in this thesis and will be used as the main source for illustrative

examples and experimental data. However, the formulation and algorithms have been

designed to be both flexible and expandable by considering the implications for other

problems described in the literature. In particular, it is reasonable to conclude that

the methods described here are generally applicable to a large number of the rostering

problems that might be found in UK hospitals.

Throughout this thesis many of the practical aspects of nurse rostering will be

discussed in terms of the rostering decisions that may be made. This terminology is

used to describe scheduling and re-scheduling actions that an expert decision maker

(i.e. the ward managers) makes when they are producing rosters. Particular attention

is paid to the decisions that are made when experts repair damaged areas of the roster

such as violations of nurse preferences or constraints.

The QMC ward consists of between 30 and 35 nurses and cover is required on a

24 hour basis. By NHS standards it is considered to be a medium sized ward with

high demand predictability and therefore reasonably constant staffing requirements.

Typically, rosters are produced over periods of 28 days and are described by the

months in which they fall. The rosters are non-cyclical in nature and do not adhere

to pre-established working patterns.

A number of different factors are considered when making rostering decisions. The

nurses in the ward are described using several different categorisations. Qualifications,

experience, and specific specialty training all affect the supervisory roles that nurses

may take. Factors such as gender, international status, and even personality have a

bearing on the decisions that are made. The context in which decisions are made

determines the influence these factors may have. For example, decisions made during

tightly constrained periods may ignore factors which would otherwise be considered.

Manual roster production in the QMC ward is a three stage process involving

all nurses. The self rostering planning approach is used to give employees greater

involvement in the rostering process. This approach provides an efficient means by

which staff can indicate detailed preference information. It recognises that nurses are

professionals who will fulfill their responsibilities without excessive administrative

direction. The approach is becoming more popular throughout hospitals in the UK.

A comprehensive survey of the use of this approach in NHS hospitals can be found

in [210].

The three stages are:

1. Nurses are assigned to teams (according to a particular skill mix).

2. Nurses produce partial rosters (called preference rosters) for the planning period

in consultation with other members of their teams.

3. Partial rosters are combined to produce the ward roster. Constraint violations

in the ward roster are then repaired by senior staff members.

Preference rosters represent individual nurses’ requests to work particular shifts

on particular days. If they have no preference on a particular day then they can leave

it blank. Preference rosters vary considerably between nurses with regard to the

amount of detail included and individual flexibility. The third stage is the most time

consuming in the process and burdens a senior nurse with numerous hours of extra

work every month. The constraint violations present in the roster must be repaired

whilst maintaining as much information from the preference rosters as possible. In

some extreme circumstances, when preference information has been severely damaged,

certain constraint violations can be ignored (at the discretion of the senior nurse).

In this chapter the nurse rostering problem will be formulated. In Section 4.2 the

basic problem variables will be introduced including the nurses and the shifts and

shift preferences they may be assigned. The representation of constraint information

will be described in Section 4.3 along with the data structures used to describe specific

violations of constraints. In Section 4.4 the repairs that may be used to make changes

to rosters are described. Section 4.5 introduces the concept of problem and solution

spaces as a framework for the methods described in later chapters. The data-sets from

the QMC ward are described in Section 4.6 and some sample rosters are discussed.

4.2 Nurses and Shifts

The nurse rostering problem is represented by the ordered pair

R = 〈N,C〉 (4.1)

where

N = {nursei : 0 ≤ i < I} (4.2)

is the set of I nurses to be rostered, and

C = {φk : 0 ≤ k < K} (4.3)

is the set of K constraints. The set N contains information about the nurses to be

rostered, the shifts they have been assigned and the shifts that they would prefer to

work over the rostering period. The set C imposes constraints on the shift assignments

in N .

Each nurse is denoted by a 4-tuple,

nursei = 〈NurseTypei, hoursi, NRi, NPi〉 (4.4)

where NurseTypei is an array of descriptive information about nursei, and hoursi is

the maximum number of hours that nursei should work according to their contract

(for example this is 37.5 for full time nurses in the QMC). The set of assignment

variables, NRi, is defined:

NRi = {si,j : 0 ≤ j < p}, 0 ≤ i < I (4.5)

where si,j are the shift assignments of nursei on days j over the rostering period of

length p (usually 28 days). Similarly, the set of preferred assignments, NPi, is defined:

NPi = {〈pi,j, hi,j〉 : 0 ≤ j < p}, 0 ≤ i < I (4.6)

where pi,j are the shifts that nursei would prefer to work on days j over the rostering

period, and the hi,j are Boolean variables indicating whether the corresponding pref-

erence is a hard request or not (in which case it is called a soft request). Hard and

soft requests are treated differently during the rostering process and it is expected

that hard requests will only be violated in exceptional circumstances. NPi will be

referred to as the preference roster of nursei.

The shifts that can be assigned to the si,j and pi,j variables are:

UNASSIGNED: (U) No shift is currently assigned and the nurse has no request.

OFF: (O) No shift is currently assigned and the nurse has requested that they do

not work.

EARLY: (E) The early shift (0700-1445).

LATE: (L) The late shift (1330-2115).

NIGHT: (N) The night shift (2100-0715).
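The nurse tuple and the shift domain defined above can be sketched as a small data structure. This is a minimal illustration, not the thesis implementation: the names Shift, Nurse and new_nurse are hypothetical, and a 28-day period is assumed as the default.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class Shift(Enum):
    """The five values that the s_ij and p_ij variables may take."""
    UNASSIGNED = "U"
    OFF = "O"
    EARLY = "E"
    LATE = "L"
    NIGHT = "N"


@dataclass
class Nurse:
    """nurse_i = <NurseType_i, hours_i, NR_i, NP_i> over a period of p days."""
    nurse_type: Tuple[str, str, str, str, str]  # (qual, gend, intl, train, grade)
    hours: float                                 # contracted hours, e.g. 37.5
    roster: List[Shift] = field(default_factory=list)                    # NR_i
    preferences: List[Tuple[Shift, bool]] = field(default_factory=list)  # NP_i


def new_nurse(nurse_type, hours, p=28):
    """Create a nurse with nothing assigned and no requests over p days."""
    return Nurse(nurse_type, hours,
                 roster=[Shift.UNASSIGNED] * p,
                 preferences=[(Shift.UNASSIGNED, False)] * p)


nurse = new_nurse(("RN", "F", "H", "ET", "E"), 37.5)
nurse.preferences[3] = (Shift.EARLY, True)  # hard request for EARLY on day 3
nurse.roster[3] = Shift.EARLY               # the request is honoured
```

Each preference entry pairs the requested shift p_ij with the Boolean h_ij, so hard and soft requests share one list and are told apart by the flag.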

The descriptive information about a nurse is stored in the NurseType structure.

Table 4.1: The fields of NurseType

Field   Domain                         Description
qual    {RN, EN, AN, SN, QN, PN, XN}   the qualification level of the nurse
gend    {M, F}                         the gender of the nurse: (M)ale or (F)emale
intl    {I, H}                         the nationality of the nurse: (I)nternational or (H)ome
train   {ET, NT}                       the specialty training of the nurse: eye-trained (ET) or non-eye-trained (NT)
grade   {A, B, C, D, E, F, G, H, I}    the grade of the nurse according to the NHS grading system

As well as being used to describe individual nurses, this data structure is also used

to describe nurse requirements for the definitions of constraints. The structure is

represented as a 5-tuple of fields as follows:

NurseTypeγ = 〈qualγ, gendγ, intlγ, trainγ, gradeγ〉, γ ∈ Γ (4.7)

where the fields are defined in Table 4.1.

Nurses in the QMC ward belong to one of four possible qualification categories.

These are, in descending order of seniority: registered (RN), enrolled (EN), auxil-

iary (AN), and student (SN). Registered nurses are the most qualified and have had

extensive training in both the practical and managerial aspects of nursing, whereas

enrolled nurses have received mainly practical training. Auxiliary nurses are unquali-

fied nurses who can perform basic duties and student nurses are training to be either

registered or enrolled. These four nurse qualifications are classified hierarchically by

using three additional qualifications. RNs and ENs are classified as qualified (QN)

while QNs and ANs are both employed (PN). Finally, XN denotes nurses of any qualification

level. Figure 4.1 shows the relationships between the different qualification

categories. The bold text indicates the four real qualifications.
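The qualification hierarchy can be sketched as a parent map that is walked upwards. This is an illustrative sketch only: the names PARENT and satisfies are hypothetical, and placing SN directly beneath XN is an assumption drawn from the text (students are neither qualified nor described as employed), since Figure 4.1 itself is not reproduced here.

```python
# Parent of each category: RN and EN are qualified (QN); QN and AN are
# employed (PN); XN covers every nurse. SN -> XN is an assumption.
PARENT = {"RN": "QN", "EN": "QN", "QN": "PN", "AN": "PN", "PN": "XN", "SN": "XN"}


def satisfies(actual: str, required: str) -> bool:
    """True if a nurse whose real qualification is `actual` counts as `required`."""
    q = actual
    while q is not None:
        if q == required:
            return True
        q = PARENT.get(q)  # None once the top of the hierarchy is passed
    return False


print(satisfies("RN", "QN"))  # True: registered nurses are qualified
print(satisfies("AN", "QN"))  # False: auxiliary nurses are not qualified
print(satisfies("EN", "PN"))  # True: enrolled nurses are employed
```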

Registered and enrolled nurses can receive additional training specific to the ward

that they work in. In the ophthalmology ward these nurses receive eye-training,

denoted ET (a nurse without eye-training is called non-eye-trained denoted NT).

Figure 4.1: Nurse hierarchy

The specialty training allows nurses to perform certain procedures and to supervise

staff who do not have the training. A certain minimum number of specialty trained

nurses are required to be on the ward at all times.

The grade of each nurse is determined by a National Health Service (NHS) stan-

dard which is based on the qualification level of the nurse and the amount of hospital

experience that they have. In NHS hospitals grades can range from ‘A’ (most junior)

through to ‘I’ (most senior) although in practice only a subset are used. The grading

system establishes supervisory relationships between nurses. All nurses of grade D

and below who do not have specialty training must be supervised by nurses of grade

E or above. In extreme circumstances eye-trained D grade nurses may supervise but

this is discouraged due to litigation risk.

The gender and nationality of a nurse are also factors that affect rostering deci-

sions. NHS hospitals are recruiting significant numbers of international nurses due

to the fall in the number of home-trained nurses. It is unofficial policy that these

international nurses are spread evenly between shifts so that they are adequately su-

pervised by UK staff. The gender of nurses is important particularly when dealing

with older patients, many of whom are not comfortable with being cared for by mem-

bers of the opposite sex. Again, this is dealt with by ensuring a balance of staff of

each gender are on the ward at all times.

4.3 Constraints and Constraint Violations

In this section constraints are defined for the nurse rostering problem. They are

deliberately general and flexible with the intention that decision makers can choose

and adapt the constraints that are appropriate for their individual rostering problems.

Each constraint type has a number of parameters which can be set. For all of the

constraints the decision maker can specify which nurses are affected according to the

nurse type information defined in the previous section.

In the literature constraints are often divided into two categories, hard and soft

[53]. Hard constraints are those that must not be violated. A roster is considered to

be feasible if it violates none of the hard constraints. Soft constraints are considered

to be more flexible and do not have to be satisfied. They are often used to describe

roster quality by adding a penalty to an objective function proportional to the degree

of their violation. In this thesis the difference between hard and soft constraints is not

defined explicitly. It will be shown later that the treatment of constraint violations

is captured implicitly within the experience stored in the case-base.

This decision reflects the reality of real-world rostering problems where the bor-

derline between hard and soft is difficult to discern. In fact the treatment can depend

heavily on the context in which constraint violations take place. Constraints which

would normally be considered hard, such as cover or work hours constraints, can be

relaxed in extreme circumstances. The rigid subdivision of constraints into the two

categories fails to provide a flexible enough model for most real world problems.

Each constraint φk in C is defined as a function which operates on a structure

containing information about the constraint, Constraintk, and the set of nurses, N :

φk(Constraintk, N) = {violationkm : 0 ≤ m < Mk}, 0 ≤ k < K (4.8)

where the violationkm are the Mk violations of constraint φk in the current roster.

Constraints may be one of twelve different types, and different parameters are used

for each type. In general, each Constraintk is a 3-tuple:

Constraintk = 〈cTypek, NurseTypek, cParamSetk〉, 0 ≤ k < K (4.9)

where cTypek is the type of constraint, NurseTypek describes the nurses whose shift

assignments will be restricted by the constraint, and cParamSetk is a set of param-

eters specific to the constraint type.

The twelve constraint types defined are:

Cover constraints define the minimum number of nurses of a particular description

(represented by the NurseType structure used to describe nurses) that must be

assigned to a particular shift. It is normal for a large number of cover constraints

to be defined for rostering problems. A set of cover constraints imposed on a single

shift together define the skill mix that must be present on the ward for that

shift. The flexibility provided by the use of the NurseType structure allows

for overlapping constraints which are often used in real-world problems. For

example, it may be the case that 4 qualified nurses and 1 registered nurse are

required for the early shift and this would be represented as two constraints.

The second constraint is satisfied if one of the 4 qualified nurses is an RN, but

if this is not the case then an RN must be assigned to the shift in addition to

the 4 nurses already assigned.

HardRequest constraints are violated if a nurse’s preferred shift is different from

their assigned shift, and the preferred shift is a hard request. It is assumed that

HardRequest constraints will represent shift preferences which should only be

violated in extreme circumstances. They can also be used to represent the pre-

ferred working patterns of nurses with strictly specified contracts. For example

many part time nurses only work early shifts on specified days so that they can

collect their children after school.

MaxDaysOn constraints limit the number of days that nurses may work in a row.

For the QMC ward this is generally 6 for all nurses. This constraint is often

violated by nurses’ shift requests and it is up to individual ward managers to

decide in which circumstances this is acceptable.

MaxHours constraints set the maximum number of hours a nurse may work over

a period. This can differ considerably between nurses and depends on their

individual working contracts. Full time nurses at the QMC ward may not work

more than 75 hours in a fortnight. Specifying limits over a fortnight allows for

some flexibility in shift assignments in individual weeks. Nurses who work more

hours during one week will be compensated with extra time off in the following week.

MinDaysOn constraints define the minimum number of days that nurses may work

successively. This is generally set at 2 days for full-time nurses. It is not used for

most part-time nurses because of the smaller number of shifts that are assigned

each week.

MinHours constraints set the minimum number of hours a nurse may work over a

period. This constraint is violated if nurses’ time is under-utilised by the roster.

SingleNight constraints are violated if nurses are assigned individual night shifts.

Nurses at the QMC ward prefer to work night shifts in blocks of two or more.

Again, this applies mainly to full time nurses - it is often the case that part

time nurses will work a single night shift in isolation.

SoftRequest constraints are violated if a nurse’s preferred shift is different from

their assigned shift and the preferred shift is a soft request. During the self-

rostering process nurses may provide very detailed sequences of preferred shifts.

These preferences however are not considered to be binding and they are often

violated during the repair of the roster.

Succession constraints define illegal shift combinations for nurses. Certain shift

types should not be worked on one day and followed by particular other shift types on the

day after. For example, a NIGHT shift followed by an EARLY shift is one

such combination.

WeekendBalance constraints set the number of weekends that nurses may work

over a period. They state the maximum number of weekends a nurse may work

in any given number of weekends. For example, nurses in the QMC ward may

not work more than 2 weekends out of every 4, unless it is stipulated in their

contracts.

WeekendsInARow constraints set the maximum number of weekends that nurses

may work in a row. This is usually three in the QMC ward. This constraint

is often covered by the WeekendBalance requirement and may only be used for

certain nurses.

WeekendSplit constraints are violated if a nurse works only one of the days in a

weekend. This constraint encourages weekends to be used in full and makes

adhering to WeekendBalance and WeekendsInARow constraints more likely.
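The overlapping Cover example given above (4 qualified nurses and 1 registered nurse on the early shift) can be checked with a few lines of code. This is a minimal sketch under simplifying assumptions: nurses are represented only by their qualification codes, and the function name cover_shortfalls is hypothetical.

```python
QUALIFIED = {"RN", "EN"}  # registered and enrolled nurses count as qualified


def cover_shortfalls(quals_on_shift, required_qn=4, required_rn=1):
    """Return (QN shortfall, RN shortfall) for one shift on one day."""
    qn = sum(1 for q in quals_on_shift if q in QUALIFIED)
    rn = sum(1 for q in quals_on_shift if q == "RN")
    return max(0, required_qn - qn), max(0, required_rn - rn)


# One of the four qualified nurses is an RN: both constraints are satisfied.
print(cover_shortfalls(["RN", "EN", "EN", "EN"]))  # (0, 0)
# Four ENs satisfy the QN constraint, but an RN must still be added.
print(cover_shortfalls(["EN", "EN", "EN", "EN"]))  # (0, 1)
```

The same nurse is counted by both constraints, which is what allows a single registered nurse to contribute to the skill mix twice.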

The parameter set associated with each constraint varies in size and content de-

pending on the constraint type:

Cover: cParamSet = {shift, min} where shift ∈ {E,L,N} and min ∈ Z+. There

must be at least min nurses working the indicated shift;

HardRequest: cParamSet = ∅. No additional parameters are used to describe

these constraints;

MaxDaysOn: cParamSet = {max} where max ∈ Z+. Nurses may work no more

than max days in a row;

MaxHours: cParamSet = {max, period} where max ∈ R+ and period ∈ [0, p− 1].

Nurses may work no more than max hours in any stretch of period days;

MinDaysOn: cParamSet = {min} where min ∈ Z+. Nurses may work no less

than min days in a row;

MinHours: cParamSet = {min, period} where min ∈ R+ and period ∈ [0, p − 1].

Nurses may work no less than min hours in any stretch of period days;

SingleNight: cParamSet = ∅. No additional parameters are used to describe these

constraints;

SoftRequest: cParamSet = ∅. No additional parameters are used to describe these

constraints;

Succession: cParamSet = {shift1, shift2} where shift1, shift2 ∈ {U,E,L,N}.

Nurses may not work shift1 followed immediately by shift2;

WeekendBalance: cParamSet = {max, num} where max, num ∈ Z+. Nurses may

not work more than max weekends in every num weekends;

WeekendsInARow: cParamSet = {max} where max ∈ Z+. Nurses may work no

more than max weekends in a row;

WeekendSplit: cParamSet = ∅. No additional parameters are used to describe

these constraints.

For an example of the instantiation of a Constraint structure consider a constraint

which restricts the maximum number of consecutive days on which any nurse may

work to six:

〈MaxDaysOn, 〈XN, ∗, ∗, ∗, ∗〉, {6}〉 (4.10)

where the wildcard symbol * in the NurseType structure indicates that the field may

take any value.
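The Constraint 3-tuple and the wildcard matching can be sketched as follows. This is an illustrative sketch: Constraint and applies_to are hypothetical names, the parameter set is held as a dictionary for readability, and the intermediate categories QN and PN are deliberately not resolved here (only the wildcard and XN match every nurse).

```python
from typing import NamedTuple, Tuple

WILDCARD = "*"


class Constraint(NamedTuple):
    """Constraint_k = <cType_k, NurseType_k, cParamSet_k>."""
    c_type: str
    nurse_type: Tuple[str, str, str, str, str]  # (qual, gend, intl, train, grade)
    params: dict


def applies_to(constraint: Constraint, nurse_type) -> bool:
    """Field-by-field match; '*' and the qualification 'XN' match anything."""
    for i, (required, actual) in enumerate(zip(constraint.nurse_type, nurse_type)):
        if required == WILDCARD or (i == 0 and required == "XN"):
            continue
        if required != actual:
            return False
    return True


# Constraint (4.10): no nurse of any type may work more than six days in a row.
max_days_on = Constraint("MaxDaysOn", ("XN", "*", "*", "*", "*"), {"max": 6})
print(applies_to(max_days_on, ("AN", "M", "I", "NT", "A")))  # True
```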

A distinction is made between the constraints imposed upon a particular rostering

problem and the violations of those constraints. Here constraints are defined as

functions which generate information about the parts of the roster in which they are

not satisfied. This information is represented by violation structures which describe

specific occurrences of constraint violations in the roster.

When constraint function φk of type cTypek is applied to N it produces violations

of the form:

violationkm = 〈cTypek, vParamSetkm〉, 0 ≤ m < Mk (4.11)

where vParamSetkm is a set of parameters describing the violation. The parameter

sets for violations differ according to the type of constraint they were generated by.

The parameters for violations of the twelve constraint types described above are given below:

Cover: vParamSet = {NurseType, day, shift} where NurseType is taken from the

constraint which generated the violation, day ∈ [0, p − 1] and shift ∈ {E,L,N}.

There are insufficient nurses of type NurseType assigned to shift on day.

HardRequest: vParamSet = {day, nurse} where day ∈ [0, p− 1] and nurse ∈ N .

The hard request of nurse on day has been violated.

MaxDaysOn: vParamSet = {startDay, numDays, nurse} where startDay ∈ [0, p − 1],

numDays ∈ [1, p − startDay], and nurse ∈ N . Here nurse is working too

many shifts in a row starting on startDay for numDays days.

MaxHours: vParamSet = {startDay, numDays, nurse} where startDay ∈ [0, p − 1],

numDays ∈ [1, p − startDay], and nurse ∈ N . Here nurse is working too

many hours starting from startDay for numDays days.

MinDaysOn: vParamSet = {startDay, numDays, nurse} where startDay ∈ [0, p − 1],

numDays ∈ [1, p − startDay], and nurse ∈ N . Here nurse is working too

few shifts in a row starting from startDay for numDays days.

MinHours: vParamSet = {startDay, numDays, nurse} where startDay ∈ [0, p − 1],

numDays ∈ [1, p − startDay], and nurse ∈ N . Here nurse is working too

few hours starting from startDay for numDays days.

SingleNight: vParamSet = {day, nurse} where day ∈ [0, p − 1] and nurse ∈ N .

Here nurse is working a single night shift on day.

SoftRequest: vParamSet = {day, nurse} where day ∈ [0, p − 1] and nurse ∈ N .

The soft request of nurse on day has been violated.

Succession: vParamSet = {day, nurse} where day ∈ [0, p − 2] and nurse ∈ N .

Here nurse has an illegal pair of shifts on days day and day + 1.

WeekendBalance: vParamSet = {day, numWeekends, nurse} where day ∈ [0, p − 1],

numWeekends ∈ Z+, and nurse ∈ N . Here nurse is working too many

weekends within the period of numWeekends weekends starting from day.

WeekendsInARow: vParamSet = {day, numWeekends, nurse} where day ∈ [0, p − 1],

numWeekends ∈ Z+, and nurse ∈ N . Here nurse is working too many

weekends in a row, starting from day, for numWeekends weekends.

WeekendSplit: vParamSet = {day, nurse} where day ∈ [0, p − 1] and nurse ∈ N .

Here nurse is working only one of the days of the weekend on day.

Following the previous example of a MaxDaysOn constraint (4.10), a violation

involving nurse8 starting on day 0 for six days would be written:

〈MaxDaysOn, {0, 6, nurse8}〉 (4.12)
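A constraint function of the kind defined in (4.8) can be sketched for the MaxDaysOn type. This is a minimal illustration, not the thesis code: it assumes that a run of working days is reported, with its full length, whenever that length exceeds max, and that shifts are held as single-letter codes. The source does not spell out how numDays is counted, so that choice is an assumption, as is the function name.

```python
WORKING = {"E", "L", "N"}  # shift codes that count as a day worked


def max_days_on_violations(roster, nurse, max_days):
    """Return <MaxDaysOn, (startDay, numDays, nurse)> for every over-long run."""
    violations = []
    run_start = None
    # A trailing "U" acts as a sentinel so a run ending on the last day is closed.
    for day, shift in enumerate(list(roster) + ["U"]):
        if shift in WORKING:
            if run_start is None:
                run_start = day
        elif run_start is not None:
            length = day - run_start
            if length > max_days:
                violations.append(("MaxDaysOn", (run_start, length, nurse)))
            run_start = None
    return violations


roster = ["E"] * 7 + ["O"] + ["L"] * 3
print(max_days_on_violations(roster, "nurse8", 6))
# [('MaxDaysOn', (0, 7, 'nurse8'))]
```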

4.4 Repairs

In this section the actions that can be used to change the shift assignments of nurses

are defined. A repair is defined as an action which alters the assignment of shifts for

the nurses in N . The notation used is:

repairn = 〈rTypen, rParamSetn〉 (4.13)

where rTypen is the type of repair and rParamSetn is a set of parameters specific to

the type of repair.

Four different repair operations have been identified for this rostering problem.

They are the natural repairs that are most commonly used by rostering experts.

Figure 4.2 gives graphical examples of three of the repairs. They are defined below

along with the parameter sets used to describe them.

Reassign: rParamSet = {nurse, day, shift} where nurse ∈ N , day ∈ [0, p − 1],

and shift ∈ {U,E,L,N}. Action: Assign shift to nurse on day.

Figure 4.2: Basic repair action types

Swap: rParamSet = {nurse1, nurse2, day} where nurse1, nurse2 ∈ N and day ∈ [0, p − 1].

Action: Interchange the shift assignments of nurse1 and nurse2 on day.

Switch: rParamSet = {nurse, day1, day2} where nurse ∈ N and day1, day2 ∈ [0, p − 1].

Action: Interchange the shift assignments of nurse on day1 and day2.

Ignore: rParamSet = ∅. Action: Do nothing.

For example, a repair which assigns the EARLY shift to nurse7 on day 3 would

be written:

〈Reassign, {nurse7, 3,E}〉 (4.14)
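The four repair types can be sketched as a single dispatch function that acts on a roster. This is an illustrative sketch: the roster is assumed to be a dictionary mapping a nurse identifier to a list of single-letter shift codes, the roster is modified in place, and the name apply_repair is hypothetical.

```python
def apply_repair(roster, repair):
    """Apply <rType, rParamSet> to roster (dict: nurse -> list of shift codes)."""
    r_type, params = repair
    if r_type == "Reassign":
        nurse, day, shift = params
        roster[nurse][day] = shift
    elif r_type == "Swap":
        # Two nurses exchange their assignments on the same day.
        nurse1, nurse2, day = params
        roster[nurse1][day], roster[nurse2][day] = roster[nurse2][day], roster[nurse1][day]
    elif r_type == "Switch":
        # One nurse exchanges the assignments of two of their own days.
        nurse, day1, day2 = params
        row = roster[nurse]
        row[day1], row[day2] = row[day2], row[day1]
    elif r_type != "Ignore":
        raise ValueError(f"unknown repair type: {r_type}")
    return roster


roster = {"nurse7": ["U"] * 5, "nurse2": ["L", "N", "N", "O", "O"]}
apply_repair(roster, ("Reassign", ("nurse7", 3, "E")))  # repair (4.14)
apply_repair(roster, ("Swap", ("nurse7", "nurse2", 0)))
apply_repair(roster, ("Switch", ("nurse2", 1, 3)))
apply_repair(roster, ("Ignore", ()))
```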

4.5 Problem and Solution Spaces

The formulation of the nurse rostering problem as described in this chapter allows

for the development of novel methods. The definitions of constraint violations and

repairs as identifiable objects reflect the reality of expert decision making for these

types of real-world problems. Constraint violations are modelled as problems that

must be solved and the actions that are used to repair them are the solutions. Given

this interpretation it is sensible to define problem and solution spaces in order to

complete the formulation.

The set of violations of constraints in a particular instance of a rostering problem

and the set of possible repairs that can be made to the roster are described as the

instance spaces of violations and repairs, respectively.

Given R = 〈N, C〉, the instance space of violations, PRv , is defined as the union of

all the sets of violations generated by applying the constraints in C to N :

PRv = ⋃_{φk ∈ C} φk(Constraintk, N) (4.15)

The instance space of repairs, PRr is the set of all possible repairs given the set of

nurses N . This can be formulated using the following sets of repairs:

AReassign = {〈Reassign, {nurse, day, shift}〉} (4.16)

for all nurse ∈ N , day ∈ [0, p − 1], and shift ∈ {U,E,L,N},

ASwap = {〈Swap, {nurse1, nurse2, day}〉} (4.17)

for all nurse1, nurse2 ∈ N and day ∈ [0, p − 1], and

ASwitch = {〈Switch, {nurse, day1, day2}〉} (4.18)

for all nurse ∈ N and day1, day2 ∈ [0, p− 1].

Then PRr is defined:

PRr = AReassign ∪ ASwap ∪ ASwitch ∪ {〈Ignore, ∅〉} (4.19)

These instance spaces are dynamic in the sense that every time a violation is

repaired their contents will change. It is likely that many of the repairs that are used

to repair violations will in turn cause more violations. Any changes to the roster will

also alter the effect of the repairs in PRr .
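The instance space of repairs in (4.16) to (4.19) can be enumerated directly, which also shows how quickly it grows. This is a minimal sketch that follows the set definitions literally, so degenerate repairs such as swapping a nurse with themselves are included. The function name and the ward sizes used are hypothetical.

```python
from itertools import product

SHIFTS = ("U", "E", "L", "N")


def repair_space(nurses, p):
    """Enumerate A_Reassign, A_Swap, A_Switch and the single Ignore repair."""
    days = range(p)
    reassign = [("Reassign", (n, d, s)) for n, d, s in product(nurses, days, SHIFTS)]
    swap = [("Swap", (n1, n2, d)) for n1, n2, d in product(nurses, nurses, days)]
    switch = [("Switch", (n, d1, d2)) for n, d1, d2 in product(nurses, days, days)]
    return reassign + swap + switch + [("Ignore", ())]


# For I nurses and p days the size is 4*I*p + I*I*p + I*p*p + 1.
print(len(repair_space(["a", "b", "c"], 4)))  # 133
print(len(repair_space(range(20), 28)))       # 29121
```

Even for a hypothetical ward of 20 nurses over 28 days the space holds tens of thousands of candidate repairs per violation, which is why some mechanism for selecting suitable members is needed.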

The nurse rostering problem modelled in this way can be seen as the problem of

finding suitable members of PRr for each violation in PRv . If it is possible to reach a PRv

with no members then a complete solution to the roster will have been found.

4.6 Problem Data

The problem description and data were elicited from the QMC ward through a series

of interviews with the senior nursing staff. During these interviews the nature of the

rostering problem at the QMC was discussed. In particular, the constraints were

discussed in great detail. The existing system used by the ward was also discussed.

The preference rosters and information about nurses were provided for a number

of months. This real-world data is used for the experiments presented in this thesis.

Seven months of data are used in the experiments - from March 2001 until September

2001. Throughout the development of the research the number of constraints that

were applied to the problems has grown. Following is a list of the constraints that

apply to the problems for each chapter.

Chapter 5 Basic Cover and MaxHours constraints. These experiments were used to

investigate the potential of the case-based method in the early stages:

1. Cover: EARLY shifts require 4 Qualified Nurses

2. Cover: EARLY shifts require 1 Registered Nurse

3. Cover: EARLY shifts require 1 Eye-Trained Nurse

4. Cover: EARLY shifts require 1 Auxiliary Nurse

5. Cover: LATE shifts require 3 Qualified Nurses

6. Cover: LATE shifts require 1 Registered Nurse

7. Cover: LATE shifts require 1 Eye-Trained Nurse

8. Cover: LATE shifts require 1 Auxiliary Nurse

9. Cover: NIGHT shifts require 2 Qualified Nurses

10. Cover: NIGHT shifts require 1 Eye-Trained Nurse

11. Cover: NIGHT shifts require 1 Auxiliary Nurse

12. MaxHours: The maximum number of hours any nurse may work in a

fortnight (14 days) is 75

Chapters 6, 7 MaxDaysOn, MinDaysOn, Succession and contract based MaxHours

constraints. These constraints made the problems considerably more realistic

and provide a good basis for testing meta-heuristic hybrid algorithms:

1. MaxDaysOn: The maximum number of consecutive shifts for nurses of any

type is 6

2. MinDaysOn: The minimum number of consecutive shifts for full time

nurses is 2

3. Succession: An EARLY shift must not follow a NIGHT shift

4. MaxHours: The maximum number of hours any nurse may work is as set

in their contracts

Chapter 8 Fully constrained problems. These problems included all of the con-

straints that were discussed with the senior nurses at the QMC ward:

1. HardRequest: Hard requests should be satisfied

2. MinHours: The minimum number of hours any nurse may work is 5 less

than the maximum set in their contract

3. SingleNight: Nurses can not work only one night shift

4. SoftRequest: Soft requests should be satisfied

5. WeekendBalance: Nurses may work no more than 3 weekends out of every

6. WeekendsInARow: Nurses may work no more than 3 weekends in a row

7. WeekendSplit: Nurses must work either both days of a weekend or not at all

It should be noted that in the early experiments nurse preference requests were

not modelled as constraints because they were incorporated into the solutions through

the definitions of features used to choose repairs.
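As an illustration, the twelve Chapter 5 constraints listed above can be written out as Constraint 3-tuples. This is a sketch under stated assumptions: the helper cover and the dictionary form of the parameter sets are hypothetical, and the eye-trained requirement is encoded with qualification XN and training ET, which is one plausible reading of "1 Eye-Trained Nurse".

```python
ANY = "*"


def cover(shift, minimum, qual="XN", train=ANY):
    """Build a Cover constraint <cType, NurseType, cParamSet>."""
    return ("Cover", (qual, ANY, ANY, train, ANY), {"shift": shift, "min": minimum})


CHAPTER_5_CONSTRAINTS = [
    cover("E", 4, qual="QN"), cover("E", 1, qual="RN"),
    cover("E", 1, train="ET"), cover("E", 1, qual="AN"),
    cover("L", 3, qual="QN"), cover("L", 1, qual="RN"),
    cover("L", 1, train="ET"), cover("L", 1, qual="AN"),
    cover("N", 2, qual="QN"),
    cover("N", 1, train="ET"), cover("N", 1, qual="AN"),
    # At most 75 hours in any stretch of 14 days, for nurses of any type.
    ("MaxHours", ("XN", ANY, ANY, ANY, ANY), {"max": 75.0, "period": 14}),
]
```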

Figure 4.3 gives a sample preference roster for Registered and Enrolled nurses for

March 2001. This shows the division of the ward into various teams. The names of

the nurses have been changed for data protection reasons. The letters in brackets after

the nurses’ names indicate their qualification level and eye-training status. The shift

label ‘AL’ is an abbreviation for annual leave. When this is indicated in the preference

roster no rostering action can change this request. Of interest is the wide variation

in the number of preferences that nurses provide, and this is independent of the

qualifications or specialty training of the nurse. Although it is not indicated in this

figure the number of hours per week in the nurses’ contracts does have an effect on

the level of detail they provide in the preference roster.

The preference rosters are used as a starting point for the rostering process. The

constraint violations in the preference roster are repaired to produce the final roster.

The number of violations present in a preference roster is usually very high. In general,

Cover, MinHours, and MinDaysOn violations are present in the highest numbers. On

average, there are around 100 of these violations in the preference roster.

4.7 Conclusion

In this chapter the basic problem formulation has been introduced. The different

types of nurses have been defined along with the shifts that they can be assigned.

The self-rostering paradigm has been introduced as the mechanism for collecting

nurse preferences. Constraints and constraint violations have been formulated as

data structures which represent the problems in the current roster. A number of

simple repair actions have been defined which are used in order to solve constraint

violations.

Figure 4.3: Sample preference roster

The formulations used in this chapter are motivated by the case-based reasoning

approach which will be defined in Chapter 5. In particular the correspondence be-

tween constraint violations and repairs will be explored as a means by which to solve

rostering problems.

Chapter 5

Case-based Repair Generation

5.1 Introduction

This chapter will introduce the basic framework of the case-based reasoning approach

to personnel rostering. This approach operates on the mathematical model of the

nurse rostering problem defined in Chapter 4. The methods described in this chapter

do not aim to solve entire rostering problems, but rather define a mechanism for

repairing subproblems which can then be combined to solve full problems. This case-based

mechanism will be used within the hybrid algorithms described in Chapters 7 and 8.

The subjectivity and complexity of personnel rostering problems make case-based

reasoning an ideal modelling tool. The ‘learning by example’ framework that CBR

provides can be used to elicit rostering knowledge from experts in a very natural

manner. The CAse-BAsed ROSTering (CABAROST) method was developed as a

mechanism for capturing examples of individual constraint violations and the repairs

that were used by human experts to solve them [188]. This rostering knowledge is then

used for automated rostering and as a decision support system for senior nurses. The

automated methods can use the knowledge to repair constraint violations in rosters

starting either from the initial roster of nurse preferences or from a final roster which

has been disrupted by events such as staff illness or an increase in patient numbers.

In the interactive mode the knowledge can be used to guide nurses towards rostering

decisions that they have made before, whilst allowing them to change and adapt their

responses to unique circumstances.

Violations and repairs are stored as cases in a case-base and are used to solve

new violations in new rosters. When a new violation is identified in a roster the case

containing the most similar violation in the case-base is retrieved. The repair from

the retrieved case is used to generate a set of possible repairs that could be performed

on the current roster. These repairs are then compared with the repair from the

retrieved case and the most similar is selected.
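The retrieval step outlined above can be sketched as a nearest-neighbour search over stored cases. This is only an illustration: the weighted Euclidean distance, the dictionary layout of a case and the name retrieve are assumptions made for the sketch, and the retrieval method actually used is the one defined in Section 5.3.

```python
import math


def retrieve(case_base, c_type, features, weights):
    """Return the stored case of the same type whose features are closest."""
    def distance(case):
        return math.sqrt(sum(w * (a - b) ** 2
                             for w, a, b in zip(weights, case["features"], features)))

    # Only cases recording a violation of the same constraint type are compared.
    candidates = [c for c in case_base if c["c_type"] == c_type]
    return min(candidates, key=distance) if candidates else None


case_base = [
    {"c_type": "Cover", "features": [0.9, 0.1], "repair": "Reassign"},
    {"c_type": "Cover", "features": [0.2, 0.8], "repair": "Swap"},
    {"c_type": "MaxDaysOn", "features": [0.2, 0.7], "repair": "Ignore"},
]
best = retrieve(case_base, "Cover", [0.25, 0.7], [1.0, 1.0])
print(best["repair"])  # Swap
```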

The case-base is used to store historical problem solving knowledge in a generalised

context. The violations and repairs it contains must be independent of any individual

problem instance. This is essential if the experience stored about solving violations

in previous problem instances is to be used to solve violations in new instances. This

can be illustrated by considering a MaxDaysOn violation involving a particular nurse,

nurse10, in N . In order to solve this violation cases containing similar violations are

retrieved from the case-base. It is clear that the applicability of past experience is

increased if the search is not limited to previous violations involving only nurse10.

Violations of the same type involving other nurses must also be considered. The

information stored within the case-base about violations and repairs is deliberately

not roster specific. Hence, the experience stored about repairing the violations in one

roster can be used to solve violations (of the same type) in any other roster, regardless

of any staffing differences.

This chapter is arranged as follows. In Section 5.2 the issues of knowledge gen-

eralisation and representation are addressed. Sections 5.3 and 5.4 describe the re-

trieval and adaptation algorithms and a simple example is presented in Section 5.5.

An alternative representation of repair information is described in Section 5.6. The

performance of the CABAROST method is demonstrated with some simple initial

experiments in Section 5.8.

5.2 Case Structure

The case-base is a database of previous violations of constraints and their corre-

sponding repairs. Each case represents an individual problem solving episode which is

defined for the rostering problem as the repair of a single constraint violation. Cases

store the characteristics of the violations which are used by the rostering experts to

determine the action that will be used to address them. They also store information

about the repairs that were used to solve violations which is used in the future to

generate solutions to new problems.

Typically, a case-base will contain between 100 and 1000 cases. The number of

cases depends mainly on the number of constraints defined for the problem - more

constraints will require more cases to represent the different ways in which the con-

straint’s violations should be repaired. The number of cases needed for each constraint

will vary according to the type. The repairs of some constraints may not require the

consideration of many alternatives (for example if a constraint must be satisfied in

nearly every circumstance) and so would not require many cases to represent this

behaviour. Other constraints may depend more heavily on the circumstances sur-

rounding them and complex interactions between the shifts already rostered and any

preferences indicated by the nurses must be considered. Such constraints may require

many cases to represent all of the possible combinations.

Each case in the case-base consists of a pair representing a violation and the repair

that was used to solve it. The case-base can be represented as the Cartesian product

of the space of problems and the space of solutions. For the nurse rostering problem

this is defined:

CB = Wv ×Wr (5.1)

where Wv is the set of previously encountered constraint violations and Wr is the set

of repairs of these violations. A case is therefore defined as an ordered pair:

caseγ = 〈vγ, rγ〉 , γ ∈ Γ (5.2)

where vγ and rγ represent a violation and a repair, respectively.

In the case-base, a generalised violation is defined as a duple

vγ = 〈vStructureγ, vγ〉 , γ ∈ Γ (5.3)

where vStructureγ represents structural information about the violation being stored

and vγ is a vector of violation feature values.

When a problem solving episode is stored in a case the structural information

about the violation is extracted from the roster through a process called generalisa-

tion. This process identifies information about the violation which is not specific to

the current roster. For example, only the NurseType information is stored about

any nurses that are involved in the violation. The type of the violation and any shifts

involved are also stored in this structure.

In general the vStructureγ variable is defined:

vStructureγ = 〈cTypeγ, NurseTypeγ, vParamSetγ〉 (5.4)

where cTypeγ is the type of constraint which was violated, NurseTypeγ is the type

information about the nurse involved, and vParamSetγ is a set of parameters which

depends on the constraint type.

Following are the details of the parameter sets stored for each of the example

constraint violations that were given in Chapter 4.

Cover: vParamSet = {shift} where shift ∈ {E,L,N}. There were insufficient

nurses of type NurseType assigned to the indicated shift.

HardRequest: vParamSet = {assigned, requested} where assigned ∈ {U,E,L,N} and requested ∈ {O,E,L,N}. A nurse had asked to be assigned the requested

shift but instead was given the assigned shift.

MaxDaysOn: vParamSet = ∅.

MaxHours: vParamSet = ∅.

MinDaysOn: vParamSet = ∅.

MinHours: vParamSet = ∅.

SingleNight: vParamSet = ∅.

The Cover violation describing a shortage of Registered Nurses for the early shift on day 3 is

〈Cover, {〈RN, ∗, ∗, ∗, ∗〉, 3,E}〉,

and is generalised using θRv to be

〈Cover, 〈RN, ∗, ∗, ∗, ∗〉, {E}〉 .

Figure 5.1: Example: Generalisation of a Cover Violation

SoftRequest: vParamSet = {assigned, requested} where assigned ∈ {U,E,L,N} and requested ∈ {O,E,L,N}. A nurse had asked to be assigned the requested

shift but instead was given the assigned shift.

Succession: vParamSet = {shift1, shift2} where shift1, shift2 ∈ {E,L,N}. A

nurse was assigned shift1 followed by shift2.

WeekendBalance: vParamSet = ∅.

WeekendsInARow: vParamSet = ∅.

WeekendSplit: vParamSet = ∅.

Given R = 〈N, C〉 and violation problem instance space PRv we define the gener-

alisation as a function, θRv , such that

$$\theta^R_v : P^R_v \longrightarrow W_v \qquad (5.5)$$

When a violation is generalised the NurseType and other parameter information

is extracted from it to store in the case-base. Figures 5.1 to 5.3 show examples of the

generalisation process for the non-trivial generalisations (those violation types with

non-empty parameter sets).
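
The generalisation step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the CABAROST implementation: the names `VStructure`, `CoverViolation` and `generalise_cover` are hypothetical, a NurseType is assumed to be a 5-tuple with `"*"` as the wildcard, and the parameter set is held as a tuple so that ordered pairs such as {N,E} for Succession keep their order.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed encoding of a NurseType such as <RN, *, *, *, *>.
NurseType = Tuple[str, str, str, str, str]


@dataclass(frozen=True)
class VStructure:
    """Structural part of a generalised violation (Formula 5.4)."""
    c_type: str                    # type of constraint that was violated
    nurse_type: NurseType          # type information about the nurse involved
    v_param_set: Tuple[str, ...]   # parameters that depend on the constraint type


@dataclass(frozen=True)
class CoverViolation:
    """A roster-specific Cover violation: nurse type, day and shift."""
    nurse_type: NurseType
    day: int
    shift: str


def generalise_cover(violation: CoverViolation) -> VStructure:
    """Drop the roster-specific day; keep constraint type, nurse type and shift."""
    return VStructure("Cover", violation.nurse_type, (violation.shift,))


# The violation from Figure 5.1: too few Registered Nurses on the early shift, day 3.
focus = CoverViolation(("RN", "*", "*", "*", "*"), day=3, shift="E")
structure = generalise_cover(focus)
```

Because the day is discarded, the same Cover violation on any other day generalises to an equal `VStructure`, which is what allows cases recorded on one roster to be matched against violations in another.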

The vector of feature values, vγ, stores a range of different measurements of the roster at the time at which the violation took place. These features were initially designed through consultation with the rostering experts at the QMC. A subset of the features was then selected automatically from the larger candidate set using a data mining algorithm. A description of this algorithm, and a detailed discussion of the different features used, is given in Chapter 6.

The HardRequest violation describing the assignment of a shift other than that requested by nurse8 on day 6 is

〈HardRequest, {6, nurse8}〉,

and is generalised using θRv to be

〈HardRequest, 〈EN, F, H, ET, D〉, {E,O}〉,

where NurseType8 = 〈EN, F, H, ET, D〉, s8,6 = E is the shift assigned to nurse8 on day 6, and p8,6 = O was the shift that nurse8 had requested on day 6.

Figure 5.2: Example: Generalisation of a HardRequest Violation

The Succession violation describing the assignment of an illegal pair of shifts (EARLY after NIGHT) on days 11 and 12 to nurse2 is

〈Succession, {11, nurse2}〉,

and is generalised using θRv to be

〈Succession, 〈RN, M, H, NT, D〉, {N,E}〉

where NurseType2 = 〈RN, M, H, NT, D〉, and s2,11 = N, s2,12 = E were the illegal shift assignments.

Figure 5.3: Example: Generalisation of a Succession Violation

The vector of features represents the characteristics of violations which affect the choice of repairs that will be used to solve them. They are the factors which human experts take into account when making rostering decisions. They differ from the structural information stored about violations, which represents clearly identifiable differences in violation type. Feature values represent the characteristics of the roster which cannot be systematically linked to the repair outcomes without the input of the domain expert.

The features fall into one of three categories:

Statistical information is stored about various aspects of the roster. This includes

measurements of the current levels of shift assignment (as percentages of the

total number of assignable hours available), nurse preference satisfaction, the

magnitude of the violation, and the total number of constraint violations. Nurse

preference satisfaction levels are measured as the proportion of nurses' shift

requests which are satisfied. The measurement of violation magnitude depends

on the type of constraint. For example, the magnitude of a cover violation is the

shortfall in the number of nurses required. The magnitude of a MaxDaysOn

violation is the number of days assigned over the maximum limit specified.

These measurements can be taken over the whole roster or over subsets of

nurses and days, depending on the violation being stored.

Cover information is the measurement of the number of nurses working on a specified

shift. This is subdivided by nurse type and the shifts measured depend on the

type of violation. The ‘cover’ of UNASSIGNED and OFF shifts is also measured

in certain instances - when it is important to record the number of nurses who

are not currently assigned on-shifts on a particular day.

Shift Patterns are recorded on and around days during which violations have oc-

curred. This is particularly useful for violations which involve a single nurse

over a relatively short period - for example Succession and MinDaysOn viola-

tions. The knowledge of shift patterns allows rostering decisions to be made

which take into account the assignments that nurses have on the days either

side of the day on which the violation occurs.

The repair that was used to solve the violation during a recorded problem solving

episode is also stored in the case in a generalised form:

rγ = 〈rStructureγ, rγ〉 , γ ∈ Γ (5.6)

where rStructureγ describes structural information about the repair, and rγ is a

vector of repair feature values.

The structural information about a repair is information about the types of nurses

and shifts that were involved in the repair. In general this takes the form:

rStructureγ = 〈rTypeγ, rParamSetγ〉 (5.7)

The parameter sets stored in the case-base about each repair type, with respect

to the definitions of the actual repairs being generalised, are:

Reassign: rParamSet = {NurseType, newShift, oldShift} where NurseType is

taken from the nurse field of the actual repair, oldShift is the shift that the

nurse originally had on the day of the repair, and newShift is what they were

reassigned.

Swap: rParamSet = {NurseType1, NurseType2, shift1, shift2} where NurseType1

and NurseType2 are taken from the nurse1 and nurse2 fields of the actual re-

pair, respectively, and shift1 and shift2 are the shifts originally assigned to

each of the nurses on the day of the repair.

Switch: rParamSet = {NurseType, shift1, shift2} where NurseType is taken

from the nurse field of the actual repair, shift1 was the nurse's shift on day1

of the repair, and shift2 was the nurse's shift on day2.

Ignore: rParamSet = ∅. No repair was performed.

The Reassign repair describing the assignment of an early shift to nurse6 on day 4 is

〈Reassign, {nurse6, 4, E}〉,

and is generalised using θRr to be

〈Reassign, {〈EN, F, H, ET, D〉, E, U}〉

where NurseType6 = 〈EN, F, H, ET, D〉, and s6,4 = U was the shift assigned to nurse6 before the repair.

Figure 5.4: Example: Generalisation of a Reassign Repair

Repairs are generalised using the same principle that was used to generalise violations. The generalisation function for repairs, θRr , is a mapping:

$$\theta^R_r : P^R_r \longrightarrow W_r \qquad (5.8)$$

During generalisation the parameter information is extracted from the repairs

when they are stored in the case-base. Figures 5.4 to 5.6 show examples of the

generalisation process for the non-trivial generalisations (all repairs except for the

Ignore repair).

The repair indices are statistical information about the parameters of the repair

- the cover of shifts before and after the repair and the utilisation and satisfaction of

the nurses involved. Tables 5.1 to 5.3 list the features used for Reassign, Swap, and

Switch repairs. No features are defined for Ignore repairs. In these tables the ‘#’

symbol denotes ‘number of’.

Table 5.1: Reassign repair features

Given 〈Reassign, {nursei, day, shift}〉:
1) # nurses assigned shift on day
2) # nurses of type NurseTypei assigned shift on day
3) # nurses assigned si,day on day
4) # nurses of type NurseTypei assigned si,day on day
5) Assigned/Contract Hours for nursei
6) Shift pattern for nursei for two days around day

Table 5.2: Swap repair features

Given 〈Swap, {nursei, nursej, day}〉:
1) # nurses assigned si,day on day
2) # nurses assigned sj,day on day
3) # nurses of type NurseTypei assigned si,day on day
4) # nurses of type NurseTypei assigned sj,day on day
5) # nurses of type NurseTypej assigned si,day on day
6) # nurses of type NurseTypej assigned sj,day
8) Assigned/Contract Hours for nursej
9) Shift pattern for nursei for two days around day
10) Shift pattern for nursej for two days around day

The Swap repair describing the interchanging of the shifts assigned to nurses nurse6 and nurse1 on day 18 is

〈Swap, {nurse6, nurse1, 18}〉,

and is generalised using θRr to be

〈Swap, {〈AN, F, I, NT, B〉, 〈AN, M, H, NT, B〉, L, O}〉

where NurseType6 = 〈AN, F, I, NT, B〉, NurseType1 = 〈AN, M, H, NT, B〉, and s6,18 = L and s1,18 = O were the shifts assigned to nurse6 and nurse1, respectively, before the repair.

Figure 5.5: Example: Generalisation of a Swap Repair

5.3 Case Retrieval

The case retrieval phase begins after a violation has been chosen from the roster. This violation is called the focus violation. Violations are generally chosen at random when rosters are being solved by an automated algorithm. When CABAROST is used in interactive mode the user can choose the violation they wish to solve.

After the focus violation has been chosen from the set PRv it must first be gener-

alised using the θRv function so that it can be compared to the historical violations in

the case-base. The retrieval algorithm (Figure 5.8) involves two searches of the case-

base. The first search identifies the cases in the case-base whose violations match the

structural information of the focus violation. This ensures that the retrieved cases

contain only violations of the same type, and involving the same types of nurses and

shifts, as the focus violation.

The first search results in a trimmed case-base consisting of cases containing only

matching violations. It is possible that after the first search no cases are returned

in the trimmed case-base. This is an indication that the case-base does not contain

sufficient experience to solve the current violation. The user would need to supply

some examples which cover this type of constraint. It is desirable that the method

behave in this manner - information about the repairing of other constraints could

not be successfully used in most instances. More details about the on-going training

of the case-base in a semi-automated setting are given in Chapter 8.

The Switch repair describing the interchanging of the shifts assigned to nurse18 on days 7 and 9 is

〈Switch, {nurse18, 7, 9}〉,

and is generalised using θRr to be

〈Switch, {〈RN, F, H, ET, E〉, U, N}〉

where NurseType18 = 〈RN, F, H, ET, E〉, and s18,7 = U and s18,9 = N were the shifts assigned to nurse18 on days 7 and 9 before the repair.

Figure 5.6: Example: Generalisation of a Switch Repair

The matching violations in the trimmed case-base are ranked according to the

distance of their feature index vector vγ from the feature index vector of the focus

violation, vα. Given a vector of feature weights w and the case weight Wγ, distance

is calculated using a weighted nearest neighbour function:

$$d_v(\mathbf{v}_\gamma, \mathbf{v}_\alpha) = W_\gamma \sqrt{\sum_{i=0}^{F-1} w[i]\, f_i(\mathbf{v}_\gamma[i], \mathbf{v}_\alpha[i])^2} \qquad (5.9)$$

where fi(vγ[i], vα[i]) is the distance function for the ith feature and depends on the

feature type, and F is the number of features. The distance functions for each feature

are defined according to the nature of the data used to represent it. For real-valued

data the standard difference measure is used (i.e. |b − a|). For more structured

data specialised distance measures have been developed and are described in detail

in Chapter 6. In particular, the distance between two shift patterns is determined

using an adaptation of the Hamming distance.
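
A minimal Python sketch of the weighted nearest neighbour distance in Formula 5.9 is given below, restricted to real-valued features. The function names are hypothetical, and the min-max scaling of |b − a| to [0, 1] is an assumption about how the normalisation of real-valued features is carried out; the structured distance measures of Chapter 6 are not reproduced.

```python
import math
from typing import Callable, Sequence

FeatureDistance = Callable[[float, float], float]


def real_feature_distance(lo: float, hi: float) -> FeatureDistance:
    """Return |b - a| scaled to [0, 1] by an assumed feature range [lo, hi]."""
    span = hi - lo

    def distance(a: float, b: float) -> float:
        return abs(b - a) / span if span > 0 else 0.0

    return distance


def violation_distance(
    v_gamma: Sequence[float],
    v_alpha: Sequence[float],
    weights: Sequence[float],
    feature_distances: Sequence[FeatureDistance],
    case_weight: float = 1.0,
) -> float:
    """W_gamma * sqrt(sum_i w[i] * f_i(v_gamma[i], v_alpha[i])^2), as in Formula 5.9."""
    total = sum(
        w * f(a, b) ** 2
        for w, f, a, b in zip(weights, feature_distances, v_gamma, v_alpha)
    )
    return case_weight * math.sqrt(total)


# Two illustrative features with assumed ranges [0, 4] and [0, 100].
fns = [real_feature_distance(0.0, 4.0), real_feature_distance(0.0, 100.0)]
weights = [1.0, 0.5]
d_same = violation_distance((1, 32.0), (1, 32.0), weights, fns)
d_far = violation_distance((0, 0.0), (4, 100.0), weights, fns)
```

Identical feature vectors give a distance of zero, and with every per-feature distance at its maximum of 1 the result is the square root of the sum of the weights, scaled by the case weight.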

The distance measure described by Formula 5.9 returns a value indicating the

dissimilarity of the two cases. Cases containing two identical violations will have zero

distance between them. It should be noted that all of the distance functions used to

measure the distance between individual feature values normalise the output to the

interval [0,1]. For example, real-valued data is normalised using the minimum and maximum values for the feature present in the case-base.

Table 5.3: Switch repair features

Given 〈Switch, {nursei, day1, day2}〉:
1) # nurses assigned si,day1 on day1
2) # nurses assigned si,day2 on day1
3) # nurses assigned si,day1 on day2
4) # nurses assigned si,day2 on day2
5) # nurses of type NurseTypei assigned si,day1 on day1
6) # nurses of type NurseTypei assigned si,day2
10) Shift pattern value for nursei around day1
11) Shift pattern value for nursei around day2

CASE (caseγ)
    VIOLATION (vγ)
        STRUCTURE (vStructureγ)
            • Constraint violation type (cTypeγ)
            • Nurse type (NurseTypeγ)
            • Violation parameters (vParamSetγ)
        FEATURE INDICES (vγ)
            • Statistical features
            • Cover information
            • Shift patterns
    REPAIR (rγ)
        STRUCTURE (rStructureγ)
            • Repair type (rTypeγ)
            • Repair parameters (rParamSetγ)
        FEATURE INDICES (rγ)
            • Statistical features
            • Shift patterns

Figure 5.7: Case structure

Given violationα and case-base CB,
1: CBTEMP ← ∅
2: generate vα = 〈vStructα, vα〉 ← θRv (violationα)
3: for all caseγ = 〈vγ, rγ〉 ∈ CB, where vγ = 〈vStructγ, vγ〉 do
4:     if vStructγ = vStructα then CBTEMP ← CBTEMP ∪ {caseγ}
5: if CBTEMP = ∅ then return false
6: generate an array vDistance[|CBTEMP|]
7: for all caseγ = 〈vγ, rγ〉 ∈ CBTEMP, where vγ = 〈vStructγ, vγ〉 do
8:     vDistance[γ] ← dv(vγ, vα)
9: return vDistance and CBTEMP, both sorted according to vDistance

Figure 5.8: The retrieval algorithm

Figure 5.8 shows the pseudo-code for the retrieval algorithm. In line 1 the trimmed

case-base CBTEMP is initialised and in line 2 the focus violation is generalised using

Formula 5.5. CBTEMP is filled in lines 3 and 4 with cases whose violations have the

same structural information as the generalised focus violation. The search fails in

line 5 if no such cases can be found. An array of real numbers is generated of the

same size as the trimmed case-base in line 6. In lines 7 and 8 this array is filled with

the distances between the feature vectors of the cases in CBTEMP and the feature

vector of the generalised focus violation. The algorithm sorts both the distance array

vDistance and the trimmed case-base according to the values in the distance array,

before returning them in line 9.
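
The two-stage search of Figure 5.8 can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the `Case` tuple, the `retrieve` function and the `abs_sum` stand-in distance are hypothetical names, the focus violation is assumed to be already generalised, and the feature values are invented for the example.

```python
from typing import Callable, List, NamedTuple, Optional, Sequence, Tuple


class Case(NamedTuple):
    v_struct: tuple                 # structural information of the stored violation
    v_features: Tuple[float, ...]   # violation feature vector
    repair: str                     # placeholder for the stored generalised repair


def retrieve(
    focus_struct: tuple,
    focus_features: Sequence[float],
    case_base: Sequence[Case],
    distance: Callable[[Sequence[float], Sequence[float]], float],
) -> Optional[List[Tuple[float, Case]]]:
    """Filter by structure, then rank the trimmed case-base by feature distance."""
    # First search: keep only cases whose violation structure matches exactly.
    trimmed = [case for case in case_base if case.v_struct == focus_struct]
    if not trimmed:
        return None  # the case-base holds no experience of this kind of violation
    # Second search: score each remaining case and sort by ascending distance.
    scored = [(distance(case.v_features, focus_features), case) for case in trimmed]
    scored.sort(key=lambda pair: pair[0])
    return scored


def abs_sum(a: Sequence[float], b: Sequence[float]) -> float:
    """Stand-in distance for the example; Formula 5.9 would be used in practice."""
    return sum(abs(x - y) for x, y in zip(a, b))


cover_qn_early = ("Cover", ("QN", "*", "*", "*", "*"), ("E",))
case_base = [
    Case(cover_qn_early, (2.0, 54.0), "repair-1"),
    Case(cover_qn_early, (1.0, 35.5), "repair-0"),
    Case(("Succession", ("RN", "M", "H", "NT", "D"), ("N", "E")), (1.0, 30.0), "repair-x"),
]
ranked = retrieve(cover_qn_early, (1.0, 32.0), case_base, abs_sum)
```

Returning `None` when the trimmed case-base is empty mirrors line 5 of the pseudo-code, where the search fails rather than falling back on cases for other constraint types.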

For an example of the retrieval phase consider a violation of a Cover constraint

which requires that at least four qualified nurses be assigned to the EARLY shift.

The focus violation is first generalised to extract the information needed to compare

it to other cases in the case-base, including its structural information and its fea-

ture vector. The case-base is initially searched for all examples of Cover violations

involving qualified nurses. This trimmed case-base is then ranked according to the

distance that each of the violation feature vectors is from the feature vector of the

focus violation. The result is an ordered list of cases which contain examples of Cover

violations involving qualified nurses, ordered according to the similarity to the Cover

violation being solved.

In this chapter, and throughout the thesis, retrieval of cases from the case-base

is presented as a search which iterates through the entire case-base in order to find

the nearest case(s). This approach is adopted for ease of explanation. However,

such a naive approach to case-base retrieval would lead to a linear increase in search

time as the size of the case-base increases. In the implementation of CABAROST a

multi-dimensional binary search tree (or kd-tree) is in fact used to store cases in the

case-base. Such a structure allows searches to miss portions of the case-base which will

not yield similar cases, and generally provides solutions in logarithmic expected time.

Although the structure was adapted for use with CABAROST data structures it is

not described in detail in this thesis, due to the additional complexity it introduces

to the description of algorithms. Further information about kd-trees and their use in

case-based reasoning can be found in [39, 50, 97, 245].
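
For readers unfamiliar with the structure, a generic kd-tree with nearest neighbour search can be sketched as below. This is a textbook version using plain Euclidean distance and is not the adapted structure used in CABAROST, which is not described in this thesis; the names `Node`, `build` and `nearest` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point
    axis: int
    left: Optional["Node"]
    right: Optional["Node"]


def build(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Split on the median of one coordinate per level, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(
        ordered[mid],
        axis,
        build(ordered[:mid], depth + 1),
        build(ordered[mid + 1:], depth + 1),
    )


def nearest(
    node: Optional[Node],
    target: Point,
    best: Optional[Tuple[float, Point]] = None,
) -> Optional[Tuple[float, Point]]:
    """Return (distance, point) of the stored point closest to target."""
    if node is None:
        return best
    d = math.dist(node.point, target)
    if best is None or d < best[0]:
        best = (d, node.point)
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, target, best)
    # Visit the far side only if the splitting plane is closer than the best so far.
    if abs(diff) < best[0]:
        best = nearest(far, target, best)
    return best


points = [(2.0, 3.0), (5.0, 4.0), (9.0, 6.0), (4.0, 7.0), (8.0, 1.0), (7.0, 2.0)]
tree = build(points)
best_distance, best_point = nearest(tree, (9.0, 2.0))
```

The pruning test on the splitting plane is what lets the search skip portions of the tree that cannot contain a closer point.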

5.4 Repair Adaptation

The retrieval algorithm returns a sorted case-base of relevant cases, each of which

contains information about a repair. In order to generate a repair for the focus

violation these generalised repairs must be adapted to the context of the current

roster.

Before discussing the adaptation of repairs let us consider the example given for

the retrieval phase where the sorted cases all contained the repairs that were used to

solve cover violations involving qualified nurses. Consider the first case in the sorted

case-base, which could, for example, contain information about a Reassign repair,

involving a qualified nurse, who had their shift assignment on the day of the violation

changed from UNASSIGNED to EARLY. In order to adapt this repair information

to the current roster a set of Reassign repairs must be created, one for each qualified

nurse in the roster whose current shift assignment is UNASSIGNED on the day of

the violation. This set of repairs is then ranked according to the distance from the

repair in the most similar case and the repair with the smallest distance is chosen.

The generation of a new repair from the sorted case-base is performed by the

adaptation algorithm (Figure 5.9). This algorithm initially generates a set of candi-

date repairs from each of the cases in the trimmed case-base. This set is a subset of

PRr , the set of all possible repairs of roster R. The candidate repairs are of the same

type as the repairs from the cases they are adapted from and use the same types of

nurses and shifts (the information in the rStructure field of the repairs stored in the

cases). They must also be adapted to meet the requirements of the focus violation.

In particular, it is ensured that any nurses specified in the focus violation are also

included in the repairs. For example the repairs generated for a cover violation will

always involve the day on which the violation occurs. Likewise, for a MaxDaysOn

violation the generated repairs will always include the nurse who was working too

many days in a row.

A nearest neighbour function is used to measure the distance between two repairs

which is similar to that used for violation distance. Given the candidate repair feature

vector rβ and the feature vector rγ from the retrieved case:

$$d_r(\mathbf{r}_\gamma, \mathbf{r}_\beta) = \sqrt{\sum_{i=0}^{F-1} f_i(\mathbf{r}_\gamma[i], \mathbf{r}_\beta[i])^2} \qquad (5.10)$$

where fi(rγ[i], rβ[i]) is the distance function for the ith feature and depends on the

feature type, and F is the number of features.

Figure 5.9 shows the pseudo-code for the repair adaptation algorithm. In line 1

the set of candidate repairs Candidates is initialised. For the first k cases in the

trimmed case-base CBTEMP a set of candidate repairs is generated and added to the

Candidates array in lines 2 through 8. These repairs are generated by the function

GenerateCandidates, which takes the current violation and the structure of the

repair from the retrieved case as parameters. If no repairs can be generated from

the retrieved cases then the process fails in line 9. In line 10 an array of distances is

created of the same size as the array of candidate repairs. The distance between each

of the candidate repairs and the repair from the cases they were generated from is

calculated in lines 11 to 14. Finally, the list of candidate repairs is sorted according

to distances and returned in line 15.

Given CBTEMP - the trimmed set of cases; vDistance - the set of violation distances; k - a search parameter; violationα - the focus violation:
1: Candidates ← ∅
2: i ← 0
3: while |Candidates| = 0 do
4:     casei = 〈vi, ri〉 ← CBTEMP[i], where ri = 〈rStructi, vSeti〉
5:     Candidates ← Candidates ∪ GenerateCandidates(violationα, rStructi)
6:     i ← i + 1
7:     if i = k then break
8: end while
9: if |Candidates| = 0 then return false
10: generate an array rDistance[|Candidates|]
11: for all repairβ ∈ Candidates do
12:     rβ = 〈rStructβ, rβ〉 ← θRr (repairβ)
13:     rDistance[β] ← dr(rγ, rβ)
14: end for
15: return Candidates sorted according to rDistance

Figure 5.9: Adaptation algorithm

The function GenerateCandidates generates a set of repairs of the same type as the repair from the retrieved case. The repairs that it generates will always feature

the nurses and shifts involved in the violation and this ensures that the adapted

repairs are relevant to the problem currently being addressed. The function applies

one of a set of rules depending on the type of the focus violation and the repair from

the retrieved case. These rules force the generated repairs to adhere to some basic

principles. For example, they ensure that repairs of HardRequest and SoftRequest

violations take place on the day that the violation took place (a repair on any other

day would definitely not solve the violation). The rules are designed to allow the

user maximum choice whilst ensuring that the repairs generated will have the desired

effect.

Figures B.1 to B.12 give the adaptation rules for each of the 36 different possible

combinations of repair and violation types, and can be found in Appendix B. They

are grouped according to the parameters of the focus violation (given at the top

of each figure). Given in each figure is a table whose columns represent the value

assigned to the parameters of the repair. Some combinations are used rarely because

they represent decisions which are unlikely to be made by rostering experts (for example

a switch repair of a cover violation). They are included to guarantee the maximum

level of expressiveness possible for the decision maker.

To illustrate the use of adaptation rules a simple example is given. When reassign

repairs are generated for cover violations three parameters must be set, namely the

nurse, day, and new shift assignment. The corresponding adaptation rule states that

the day and shift must be as they are specified in the violation being repaired. The

nurse must be selected from the set of nurses in the roster whose NurseType is the

same as that specified in the generalised repair from the retrieved case. In fact,

one repair is then generated for each nurse in this set and the ranking part of the

adaptation algorithm will determine the distance of each of these repairs from the

retrieved repair.

The adaptation algorithm returns a sorted list of repairs. When CABAROST is

used in interactive mode these repairs are presented to the user as options. The deci-

sion maker may choose one of the generated repairs or they may make a modification

if none of the repairs are suitable. If the latter occurs then the new violation/repair combination can be stored in the case-base. In the automated mode the first repair on the list is used.

Table 5.4: A simple example of a nurse roster

day      0  1  2  3  4  5  6
nurse0   E  U  U  U  L  E  E
nurse1   L  L  L  U  U  L  U
nurse2   U  E  E  L  E  U  L

5.5 Example

To illustrate the retrieval and adaptation phases a simple example will be given.

Consider the roster in Table 5.4 where E = EARLY, L = LATE, and U = UNASSIGNED. Here nurse0 and nurse1 are registered, female, non-international, eye-trained, E-grade nurses and nurse2 is an enrolled, female, non-international, eye-trained, D-grade nurse. Hence, NurseType0 = NurseType1 = 〈RN, F, H, ET, E〉 and NurseType2 = 〈EN, F, H, ET, D〉.

A single constraint is applied to the roster, requiring that a minimum of 1 qualified

nurse is assigned to every early shift. It can be seen that on day 3 there is no nurse

assigned to the early shift. Therefore the violation that needs to be repaired is

violationα = 〈Cover, {〈QN, ∗, ∗, ∗, ∗〉, 3,E}〉, (5.11)

where ‘*’ in the feature set indicates that a feature can take any value. The violation

is passed to the retrieval phase where all the examples of cover violations involving

qualified nurses are identified in the case-base. The retrieval phase first converts the

violation into its generalised form by calculating the violation feature indices and

instantiating the structural information as follows:

θRv (violationα) = 〈vStructureα, vα〉 = 〈〈Cover, 〈QN, ∗, ∗, ∗, ∗〉, {E}〉, vα〉, (5.12)

The case-base is then searched for cases containing violations that match vStructureα.

Table 5.5: Example of case ranking

         Feature Index Values
         V-Mag  D-G-Ass  D-L-Ass  D-G-Un  D-L-Un  d (distance)
α        1      32.0     16.0     52.5    37.5    NA
case0    1      35.5     20.0     48.0    40.0    0.106
case1    2      54.0     48.0     32.0    16.0    0.774
case2    1      56.0     56.0     12.5    8.0     0.974

Suppose, for this example, that three such cases are found. These cases are ranked

according to the distance measure (Formula 5.9) and the results are described in

Table 5.5.

For the clarity of this example all of the violation feature indices described are

real or integer valued. In reality more complex feature index types are used, and

these will be described in detail in Chapter 6 and in Appendix A. For this example

the violation feature indices are:

• V-Mag - The magnitude of the constraint violation - for cover violations this

is the difference between the number of nurses of the required type and the

number currently assigned;

• D-G-Ass - The number of hours assigned to nurses of any type on the day of

the violation;

• D-L-Ass - The number of hours assigned to nurses of the required type (i.e.

according to the NurseType variable of the cover constraint) on the day of the

violation;

• D-G-Un - The number of unassigned hours, based on the number of hours a

nurse is contracted to work each day, which could be assigned shifts on the day

of the violation, for nurses of any type;

• D-L-Un - The number of unassigned hours for nurses of the required type.

The case with the smallest distance from the focus problem is retrieved from the

case-base, case0 = (v0, r0), where

v0 = 〈〈Cover, 〈QN, ∗, ∗, ∗, ∗〉, {E}〉, {1, 35.5, 20.0, 48.0, 40.0}〉 , and (5.13)

r0 = 〈〈Reassign, {〈RN, F, H, ET, E〉, E, U}〉, {1, 0, 3, 2, 85.0, 0}〉 (5.14)

The repair information stored in case0 must now be adapted to create a repair for

the current violation. In this example case0 contains a (generalised) repair of type

Reassign that uses a nurse with NurseType = 〈RN,F, H, ET,E〉 (i.e. a registered,

female, non-international, eye-trained, E-Grade nurse) who was originally assigned an

UNASSIGNED shift on the day of the repair. The adaptation phase generates a set

of reassign repairs using nurses with the correct parameter set and who are currently

assigned UNASSIGNED on day 3 (see Figure B.1 from Appendix B). In the roster

given in Table 5.4 we have two nurses with such characteristics and therefore two

repairs are generated:

repaira = 〈Reassign, {nurse0, 3,E}〉; (5.15)

repairb = 〈Reassign, {nurse1, 3,E}〉. (5.16)
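
The generation of these two candidates can be reproduced with a short Python sketch built on the roster of Table 5.4. The function names and the dictionary encoding of the roster are assumptions made for illustration, and only the Reassign-for-Cover rule is shown; the full set of adaptation rules is given in Appendix B.

```python
from typing import Dict, List, Tuple

NurseType = Tuple[str, str, str, str, str]

# Roster from Table 5.4: one shift letter per day (days 0 to 6).
ROSTER: Dict[str, str] = {
    "nurse0": "EUUULEE",
    "nurse1": "LLLUULU",
    "nurse2": "UEELEUL",
}
NURSE_TYPES: Dict[str, NurseType] = {
    "nurse0": ("RN", "F", "H", "ET", "E"),
    "nurse1": ("RN", "F", "H", "ET", "E"),
    "nurse2": ("EN", "F", "H", "ET", "D"),
}


def days_without_shift(roster: Dict[str, str], shift: str) -> List[int]:
    """Days on which no nurse at all is assigned the given shift."""
    n_days = len(next(iter(roster.values())))
    return [d for d in range(n_days) if all(row[d] != shift for row in roster.values())]


def reassign_candidates(
    roster: Dict[str, str],
    nurse_types: Dict[str, NurseType],
    day: int,
    new_shift: str,
    nurse_type: NurseType,
    old_shift: str,
) -> List[Tuple[str, str, int, str]]:
    """One Reassign repair per nurse of the retrieved type holding old_shift on day."""
    return [
        ("Reassign", nurse, day, new_shift)
        for nurse in sorted(roster)
        if nurse_types[nurse] == nurse_type and roster[nurse][day] == old_shift
    ]


uncovered = days_without_shift(ROSTER, "E")
# Day and new shift come from the focus violation; nurse type and old shift
# come from the generalised repair stored in case0.
candidates = reassign_candidates(
    ROSTER, NURSE_TYPES, day=3, new_shift="E",
    nurse_type=("RN", "F", "H", "ET", "E"), old_shift="U",
)
```

Here `uncovered` contains only day 3, and `candidates` holds the Reassign repairs for nurse0 and nurse1, matching repaira and repairb.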

We generalise these repairs by calculating index sets and compare them to the

repair from case0. These repairs are then ranked according to the distance between

the repair feature indices, as described in Table 5.6. The repair indices are:

• SCOA - The number of nurses of all types assigned to the original shift (i.e.

UNASSIGNED) on the day of the repair;

• SCOT - The number of nurses of the NurseType described in the repair assigned

to the original shift on the day of the repair;

• SCNA - The number of nurses of all types assigned to the new shift (i.e. EARLY)

on the day of the repair;

• SCNT - The number of nurses of the NurseType described in the repair assigned

to the new shift on the day of the repair;

• Util - The percentage of contract hours assigned to the nurse in the repair;

• SP - A shift pattern distance score which sums the difference in shift assignments over two equal length periods (see Formula 6.2).

Table 5.6: Example of repair ranking

repair           Feature Index Values
                 SCOA  SCOT  SCNA  SCNT  Util   SP  d (distance)
RepairIndices0   1     0     3     2     85.0   0   NA
repaira          2     2     0     0     80.0   1   0.812
repairb          2     2     0     0     100.0  4   1.276

Therefore the first repair, 〈Reassign, {nurse0, 3,E}〉, is returned as a solution.

The case containing the constraint violation and the applied repair is then added

to the case-base by generalising both the violation and repair, thus increasing the

case-base’s experience.

The example here is simpler than many encountered for ease of explanation. Other

combinations of violation and repair types involve a more complex search for candidate repairs. However all the principles needed for more complex instances are the same.

5.6 An Extended Adaptation Algorithm

The adaptation algorithm described in Section 5.4 was developed early in the research

and performed well for problems involving smaller numbers of constraints. For the

more realistic problems that were tackled later in the research a different adaptation

algorithm was developed.

The notion of similarity between repairs used in the described algorithm was

based on the distance between vectors of repair features. These features described

the state of the roster before and after the repair took place. They do not directly

represent the damage that may be done to the roster by applying the repair (i.e. the

new violations that the repair causes). The improved adaptation algorithm stores

generalised versions of the violations that are caused when a repair is applied to the

roster. The similarity between repairs is then defined in terms of the distance between

the violations that they cause.

Before describing the altered case structure it is necessary to define a method for

determining the violations that are caused by a repair. The function damageR takes

a repair and returns the set of violations that are caused by it:

damageR(repairβ) = {violationm : violationm is caused by repairβ} (5.17)

When a case is stored in the case-base the repair is generalised by extracting the

structural information and generalising all of the violations in the set returned by

damageR. Therefore the repair part of a case is now defined:

rγ = 〈rStructureγ, vSetγ〉 , γ ∈ Γ (5.18)

where vSetγ is the set of generalised violations:

$$vSet_\gamma = \{\theta^R_v(violation_m) : violation_m \in damage^R(repair_\beta)\} \qquad (5.19)$$

where repairβ was the repair that was used to solve the original violation.

Figure 5.10 gives the new adaptation algorithm, much of which is the same as the

earlier version. Although the candidate repairs produced by GenerateCandidates

are of the same type as the generalised repairs from the cases in the trimmed case-

base, the violations that they cause are often very different. The violations caused by

each candidate repair must be compared with the violations that were caused by the

generalised repairs it was generated from. In order to do this the candidate repairs

are themselves generalised using the θRr function given in (5.8).

A comparison between two sets of newly created violations must take into account

two complicating factors. First, the number of constraint violations caused by two

different repairs is likely to be different. Furthermore, even if the number of violations

is the same, their types are very likely to be different. Hence the direct measurement

of the difference between the sets of generated violations needs to be tackled carefully.

Given CBTEMP - the trimmed set of cases; vDistance - the set of violation distances; k - a search parameter; violationα - the focus violation:
1: Candidates ← ∅
2: i ← 0
3: while |Candidates| = 0 do
4:     casei = 〈vi, ri〉 ← CBTEMP[i], where ri = 〈rStructi, vSeti〉
5:     Candidates ← Candidates ∪ GenerateCandidates(violationα, rStructi)
6:     i ← i + 1
7:     if i = k then break
8: end while
9: if |Candidates| = 0 then return false
10: generate an array rDistance[|Candidates|]
11: for all 〈repairβ, casei〉 ∈ Candidates do
12:     rβ = 〈rStructβ, vSetβ〉 ← θRr (repairβ)
13:     rDistance[β] ← vDistance[i] × (1 + dr(vSeti, vSetβ))
14: end for
15: return Candidates sorted according to rDistance

Figure 5.10: The alternative adaptation algorithm

The distance between two sets of violations vSeti and vSetβ is determined using the

following measure:

\[
d_r(\mathit{vSet}_i, \mathit{vSet}_\beta) = \frac{1}{c}\left[\sum_{j=0}^{J_i-1} D(v_{i,j}, \mathit{vSet}_\beta) + (J_\beta - h)\right] \tag{5.20}
\]

where Ji = |vSeti| and Jβ = |vSetβ| are the number of violations in each set, h is

the number of violations in vSeti whose structural information is the same as that of

a violation in vSetβ, c = Ji + Jβ − h is a normalising variable, and

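\[
D(v_{i,j}, \mathit{vSet}_\beta) =
\begin{cases}
\dfrac{d_v(v_{i,j}, v_{\beta,\mu})}{\mathit{dMax}} & \text{if } \exists\, v_{\beta,\mu} \in \mathit{vSet}_\beta \text{ s.t. } \mathit{vStruct}_{i,j} = \mathit{vStruct}_{\beta,\mu} \\
1 & \text{otherwise}
\end{cases}
\tag{5.21}
\]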

where dMax = √(dim(vi,j)) is the maximum possible distance between any two

feature vectors of the same type. Dividing the distance in this way ensures that the

function D always returns values in the interval [0, 1].

Formula 5.20 finds the distance between two sets of violations by summing the

distances between each of the violations in the first set from the whole of the second

set. For each violation in the first set the distance is calculated as being maximal

(i.e. equal to 1) if there are no violations in the second set with the same structural

information. Otherwise, the distance for the violation is calculated as the distance

from the matching violation within the second set. Every time a matching violation

is found in the second set the value h is incremented. After all the violations in the

first set have been considered the sum is increased by the number of violations in

the second set which have not been compared. The normalisation is performed by

dividing by the total number of comparisons that have been made.
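The calculation described above can be sketched in Python as follows. This is a minimal illustration rather than the CABAROST implementation: the representation of a generalised violation as a (structure, features) pair, the function names, the use of a Euclidean distance for dv, and the choice of the first structurally matching violation in the second set are all assumptions made for the example.

```python
import math


def d_v(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed form of dv)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def set_distance(v_set_i, v_set_beta):
    """Distance between two sets of generalised violations, after (5.20) and (5.21).

    Each violation is a (structure, features) pair with features in [0, 1].
    """
    j_i, j_beta = len(v_set_i), len(v_set_beta)
    total, h = 0.0, 0
    for struct, feats in v_set_i:
        # First violation in the second set with the same structural information.
        match = next((f for s, f in v_set_beta if s == struct), None)
        if match is None:
            total += 1.0                           # maximal distance, as in (5.21)
        else:
            d_max = math.sqrt(len(feats)) or 1.0   # guard against empty vectors
            total += d_v(feats, match) / d_max
            h += 1
    c = j_i + j_beta - h                           # normalising variable
    if c == 0:                                     # both sets empty
        return 0.0
    return (total + (j_beta - h)) / c


# Hypothetical violation sets: one structural match, one unmatched violation.
example = set_distance([("Cover", [0.0])],
                       [("Cover", [0.5]), ("MaxHours", [0.1])])  # (0.5 + 1) / 2
```

The sketch counts h once per violation in the first set that has a match, so it does not treat specially the situation in which several violations in the first set match the same violation in the second.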

The overall distance measure (line 13 of Figure 5.10) is the product of the violation

distance, calculated in the retrieval stage and stored in the vDistance array, and the

repair distance, which is calculated using Formula (5.20) as the distance between

the two sets of violations, vSeti and vSetβ. The value 1 is added to the repair

distance in order to ensure that two identical repairs of two non-identical violations

do not have a zero distance between them. However, two repairs generated for two

identical violations should always have zero distance between them. The definition

of this measure ensures that when the candidate repairs are sorted at the end of the

adaptation algorithm the distances between the focus violation and the violations

from the retrieved cases are also reflected in the rankings.
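The behaviour of this product can be checked with a few lines of Python. The function names are illustrative and the candidate values are invented for the example; they are not experimental data.

```python
def overall_distance(v_distance, repair_distance):
    """Line 13 of Figure 5.10: violation distance scaled by (1 + repair distance)."""
    return v_distance * (1.0 + repair_distance)


def rank(candidates):
    """Sort (label, v_distance, repair_distance) tuples by overall distance."""
    return sorted(candidates, key=lambda c: overall_distance(c[1], c[2]))


# Hypothetical candidates: (label, violation distance, repair distance).
candidates = [
    ("a", 0.2, 1.0),   # similar violation, very different side effects
    ("b", 0.3, 0.0),   # less similar violation, identical side effects
    ("c", 0.0, 0.9),   # identical violation: overall distance is zero
]
ranked = [label for label, _, _ in rank(candidates)]  # ['c', 'b', 'a']
```

Candidate b keeps its violation distance of 0.3 even though its repair distance is zero, which is the reason for adding 1 before multiplying.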

5.7 Case-base Training

Care must be taken to train a case-base successfully. It is important that the

case-base contains a range of different experiences - representing a large number of dif-

ferent possible rostering scenarios. In particular, it is vital that there are cases in the

case-base which represent good examples of behaviour in contrasting circumstances.

With this in mind, an attempt is made in this section to give a set of guidelines for

the construction of a case-base.

Case-bases should be trained on real world data rather than on artificially created

problems. It is difficult to artificially create the complexity found in real problems in

such a way that will be useful for future problem solving. The real world problems

should include a wide range of different constraint violations. The (expert) user

should select a representative cross-section of these before training commences.

For each violation that the user chooses to include in the training some key points

must be considered:

1. The user must provide a repair to the given violation, making sure that it

takes into account the context that the violation occurs in. In particular, any

violations which are caused by the repair should be thought about in terms of

the user’s overall goals for the roster.

2. If other repairs could also be used to address the violation then they could

also be added into the case-base. This provides the case-base with a range of

experience.

3. The user should consider what they would have done if circumstances had been

slightly different. For example, if no nurses on the day of the violation had unassigned shifts then an alternative repair would have to be considered.

By manually altering the roster the user can generate an alternative scenario

for which a significantly different repair should be used. By including these

examples of similar violations which require different repairs the case-base is

trained to recognise the border regions in the decision space.

4. It is useful to also provide repairs to any violations created by the original repair.

This teaches the case-base how to solve a series of violations.

It is difficult to put a figure on the number of cases that should be stored in the

case-base. One advantage of the CABAROST method is that after representatives

of each violation type have been stored in the case-base the algorithm can provide

suggested solutions to the user during training. In this way the user can develop an

idea of when the case-base has been adequately trained by monitoring the number of

suggestions it made which they accepted without change.

Training of the case-base can also be considered as an on-going process. In au-

tomated mode the CABAROST system can detect if the repair distance is above a

certain threshold. This threshold can be set so as to represent the point at which

CABAROST cannot be confident in the decisions it is making. By setting this

threshold to a small value the user can indicate that they want to supervise the deci-

sion making closely and change any repairs that they disagree with. When the user is

confident that the decisions being made are correct then the threshold can be set at

a high value - so that they are required to intervene in extreme circumstances only.
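The role of the threshold can be illustrated with a short Python sketch. The function name, the representation of a suggestion as a (repair, repair distance) pair, and the numerical values are assumptions made for the example.

```python
def split_by_confidence(suggestions, threshold):
    """Partition (repair, repair_distance) pairs around a confidence threshold.

    Suggestions at or below the threshold are applied automatically; those
    above it are passed to the user for supervision.
    """
    automatic = [s for s in suggestions if s[1] <= threshold]
    supervised = [s for s in suggestions if s[1] > threshold]
    return automatic, supervised


# Hypothetical suggestions with their repair distances.
suggestions = [("r1", 0.05), ("r2", 0.40), ("r3", 0.90)]

close_supervision = split_by_confidence(suggestions, threshold=0.1)
light_supervision = split_by_confidence(suggestions, threshold=0.8)
```

With the small threshold only r1 is applied without intervention, whereas with the large threshold the user is consulted for r3 alone.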

5.8 Performance

The results presented in this section illustrate how the method performs at imitat-

ing the rostering decisions of humans. We do not compare performance with other

rostering methods for two reasons. Firstly, the amount of information used and the

way it is presented is incompatible with most existing problem formulations. More

significantly, the case-based reasoning method here treats constraint violations as in-

dividual problems rather than providing solutions to entire rostering problems in the

traditional sense.

The method has been implemented and tested on real-world data from the QMC.

This data consisted of seven 28-day rosters and the corresponding preference infor-

mation for 19 nurses of various qualification and training levels. In this experiment

two constraint types were used, namely Cover constraints and MaxHours constraints.

The actual constraints defined are:

4. Cover: EARLY shifts require 1 Auxiliary Nurse

8. Cover: LATE shifts require 1 Auxiliary Nurse

11. Cover: NIGHT shifts require 1 Auxiliary Nurse

12. MaxHours: The maximum number of hours any nurse may work in a fortnight

(14 days) is 75

The aim of the experiment was to determine the quality of the reasoning process in

terms of the agreement between automated decisions and those of the nurse rostering

expert. Constraint violations were identified at random and the repairs suggested by

this method were compared to the repairs actually made in the final roster. These

expert repairs were determined by comparing the final and preference rosters. The

‘quality’ of a generated repair was assessed by comparing it with the expert repair

and assigning one of the following verdicts:

Exact Match: The generated repair is identical to the expert’s repair.

Equivalent Match: The generated repair involves nurses of the same types and the

same shifts as those used in the expert’s repair.

Fail: The generated repair is not an exact or equivalent match, or no repair was

generated.
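The three verdicts can be expressed as a small Python function. The representation of a repair as a list of (nurse, nurse type, day, shift) assignments is hypothetical and chosen only to make the distinction between the verdicts concrete.

```python
from collections import Counter


def verdict(generated, expert):
    """Compare a generated repair with the expert's repair.

    A repair is a list of (nurse_id, nurse_type, day, shift) tuples
    (an assumed representation); None means no repair was generated.
    """
    if generated is None:
        return "Fail"
    if Counter(generated) == Counter(expert):
        return "Exact Match"

    def without_identity(repair):
        # Keep nurse type, day and shift; drop the identity of the nurse.
        return Counter((n_type, day, shift) for _, n_type, day, shift in repair)

    if without_identity(generated) == without_identity(expert):
        return "Equivalent Match"
    return "Fail"


expert_repair = [("nurse3", "AN", 4, "EARLY")]
same_nurse = verdict([("nurse3", "AN", 4, "EARLY")], expert_repair)
other_auxiliary = verdict([("nurse7", "AN", 4, "EARLY")], expert_repair)
other_type = verdict([("nurse1", "RN", 4, "EARLY")], expert_repair)
```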

The experiment was run 5 times and 120 constraint violations were repaired during

each run. For each violation three repairs were suggested by the method ranked

according to the repair distance and compared to the expert repair. The case-base

was empty at the start of each run and the expert repair for each of the constraint

violations was stored after it was applied to the roster. In this way CABAROST was

storing more experience in the case-base as the run progressed. Figure 5.11 shows

the average cumulative number of exact and equivalent matches against the case-base

size for each of the three suggested repairs. The bold lines are the first (or best with

respect to the reasoning process) repairs for each iteration.

The results in Figure 5.11 show an increasing gradient of all lines indicating an

increasing number of repairs of the given verdict per iteration. It can be seen from this

that the case-base learns how to produce more exact or equivalent repairs as its size

increases. An increase in the amount of training given to the case-base corresponds

to an increase in the quality of the repairs produced. It is particularly encouraging

that the first suggestions in general score more exact and equivalent matches than

the second and third. The increases in solution quality are made more apparent in

Figure 5.12. This shows the percentage of exact and equivalent repairs at different

stages in the runs. In general, in the later stages, when the case-base contains more

experience, a larger number of good suggestions are produced.

Figure 5.11: Average cumulative number of exact and equivalent matches against case-base size over five 120 iteration runs.

Figure 5.12: Effects of case-base size on solution quality.

5.9 Conclusion

This chapter has introduced a mechanism for repairing constraint violations within

personnel rostering problems using case-based reasoning. This method is used to

capture rostering experience on a case-by-case basis by recording the way in which

human rostering experts perform repairs. This rostering knowledge is used to solve

future problems by retrieval and adaptation methods which can successfully imitate

the behaviour of the expert who provided the training.

The problem of data generalisation has been addressed to allow experience that

has been stored in the case-base to be applied to future, perhaps significantly different,

problem instances. Information about the nurses involved in both the constraint violations and the corresponding repairs is non-specific. This increases the applicability,

and therefore usefulness, of the experience stored in the case-base.

The similarity between constraint violations has been described by structural and

statistical feature information. During retrieval cases are selected which are identical

to the current problem in terms of the structural information and closest in terms of

the distance between the statistical feature vectors.

Two methods for adapting the repair information from cases have been described,

both using notions of structural similarity. Both methods initially generate sets of

structurally identical repairs which can be applied to the current problem. The first

method ranks these according to statistical features using the same distance measure

used to rank violations in the retrieval phase. The second approach considers the new

violations which are caused by the repairs which are generated. A distance measure

based on the distance between these new violations was described.

A simple experiment was used to verify that the CABAROST algorithm is capable

of learning rostering experience from experts. The experiment also showed that the case-based method produces exact or equivalent matches to the expert’s repairs more frequently as the case-base grows.

In the following chapters further experimentation will be used to demonstrate the

effectiveness of the method.

Chapter 6

Violation Features and Weighting

6.1 Introduction

In CABAROST, cases are retrieved from the case-base using a two stage retrieval

phase. The first stage retrieves those cases containing violations of the same type

as the current problem. The second stage calculates the similarity of these cases to

the current problem using the weighted nearest neighbour method. The violations

are represented by a set of characteristic features and can be interpreted as points

in a feature space. Weights are assigned to the features representing their relative

importance. The most similar case is then defined as the one with the smallest

weighted distance from the feature vector representing the current problem. It is vital

for the retrieval phase that appropriate features are selected to represent the violations

and that these features are carefully weighted. In this chapter an automated feature

weighting and selection algorithm will be described.
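The effect of feature weights on retrieval can be illustrated with a small Python sketch. The weighted Euclidean form used here is an assumption made for the example (the distance function actually used by CABAROST is given in Formula 5.9), as are the function names and the two invented cases.

```python
import math


def weighted_distance(a, b, weights):
    """Assumed weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def most_similar(focus, cases, weights):
    """Return the (features, label) case nearest to the focus vector."""
    return min(cases, key=lambda case: weighted_distance(focus, case[0], weights))


# Two hypothetical cases that differ from the focus in different features.
cases = [([0.5, 0.9], "A"), ([0.8, 0.5], "B")]
focus = [0.5, 0.5]

equal_weights = most_similar(focus, cases, [1.0, 1.0])[1]    # B is nearer
second_ignored = most_similar(focus, cases, [1.0, 0.0])[1]   # A is nearer
```

Setting the weight of the second feature to zero changes which case is retrieved, which is why the choice of weights matters for the retrieval phase.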

One of the most common ways to determine the accuracy of a case-base is to

measure its classification accuracy [80, 103, 131]. The CABAROST method can be

viewed as a classifier which determines the type and parameters of a repair for a

given violation. Its classification accuracy can be measured by repeatedly removing

a case from the case-base, performing a retrieval to determine the nearest case to the

removed case, and then comparing the repairs in the removed and the retrieved case.

In the literature, nearest neighbour classification algorithms [87] have been used

successfully to solve a number of different classification problems. They allow com-

plex relationships between input parameters to be captured without the need to model

them explicitly [113]. However, they can be sensitive to noise in the data sets and

erroneous or irrelevant features [6]. These effects can be reduced by selecting only rel-

evant features from the feature set and assigning a weight to each feature representing

its relative importance [232].

A number of different feature weighting and selection methods have been de-

veloped including Salzberg’s [201] feature weighting algorithm based on a heuristic

approach for his EACH classification method, a random mutation hill climbing ap-

proach for feature selection by Skalak [212], and a genetic algorithm by Kuncheva

and Jain [139]. Many more feature selection and weighting algorithms are described

in reviews by Wettschereck et al. [246, 247]. In this chapter we investigate an ap-

proach to automated weighting and feature selection based on the genetic algorithm

GA-WKNN developed by Kelly and Davis [127] and a dimensionality reduction algo-

rithm developed by Raymer et al. [196]. These approaches are adapted so that they

can handle the types of data used in the CABAROST method to model the nurse

rostering problem.

In this chapter an adaptation of a feature weighting and selection algorithm to a

complex real life nurse rostering problem will be presented. This algorithm allows us

to learn which features are important when making rostering decisions and which fea-

tures are irrelevant, thus increasing our understanding of the nurse rostering problem.

The accuracy of the CABAROST method is increased by weighting the features and

the search time is decreased by reducing the number of features that it is necessary to

store in each case. Furthermore, the flexibility and adaptability of the case-based ap-

proach is enhanced because its behaviour can be tuned more precisely to the decision

making style of the expert who trained it. The training data used for the experiments

in this chapter has been derived from the QMC rosters.

This chapter is organised as follows. Section 6.2 describes the method for measur-

ing the classification accuracy of the CABAROST algorithm. Section 6.3 introduces

the different types of features used to describe the generalised violations in the case-

base. The genetic algorithm for feature weighting and selection is presented in Section

6.4. The results obtained by applying the algorithm to a case-base of rostering decisions are presented in Section 6.5.

Given case-base CB,
initialise total := 0
initialise correct := 0
for all caseL ∈ CB
    total := total + 1
    remove caseL from CB
    retrieve caseγ ∈ CB most similar to caseL
    if repair′γ = repair′L then
        correct := correct + 1
    end if
    restore caseL to CB
end for
return (100 × correct/total)

Figure 6.1: Pseudo-code for the algorithm for measuring classification accuracy

6.2 Measuring Classification Accuracy

One of the most common ways to assess the performance of a CBR system is to mea-

sure its classification accuracy [247]. Although the CABAROST method is more than

simply a classification algorithm, this measure is still very important as it indicates

how well CABAROST identifies which type of repair to generate for a given viola-

tion. It is important that the method correctly classifies violations as this increases

the likelihood that the adaptation method will generate appropriate repairs.

The classification accuracy of a case-base is measured by performing retrievals on

the cases it contains (see Figure 6.1). A leave-one-out cross-validation strategy is

used whereby a case is removed from the case-base and then passed to the retrieval

algorithm as the focus case. The retrieval algorithm returns the most similar (nearest)

case to the focus case. If the retrieved case and the focus case contain repairs with the

same structural information then the focus case is deemed to be correctly classified.

This operation is performed on every case in the case-base and the classification

accuracy is measured as the percentage of correctly classified cases.
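The leave-one-out procedure can be written in Python along the following lines. The case representation, the function names, and the toy retrieval function (which compares single numbers rather than full violation feature vectors and omits the structural first stage) are assumptions made for illustration.

```python
def classification_accuracy(case_base, retrieve):
    """Leave-one-out accuracy (as a percentage) following Figure 6.1.

    case_base: list of (violation, repair_structure) pairs.
    retrieve:  function (focus_violation, cases) -> most similar case.
    """
    if len(case_base) < 2:
        raise ValueError("need at least two cases for leave-one-out retrieval")
    correct = 0
    for idx, (violation, repair_structure) in enumerate(case_base):
        remaining = case_base[:idx] + case_base[idx + 1:]  # remove the focus case
        _, retrieved_structure = retrieve(violation, remaining)
        if retrieved_structure == repair_structure:
            correct += 1
    return 100.0 * correct / len(case_base)


def nearest(focus, cases):
    """Toy retrieval: violations are single numbers, nearest by absolute difference."""
    return min(cases, key=lambda case: abs(case[0] - focus))


# Made-up case-base: (violation feature, repair structure).
toy_cases = [(0.10, "swap"), (0.12, "swap"), (0.90, "assign"),
             (0.95, "assign"), (0.80, "swap")]
accuracy = classification_accuracy(toy_cases, nearest)  # 80.0
```

In the toy data only the last case is misclassified, because its nearest neighbour carries a different repair structure.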

An additional measure of classification accuracy is performed on the data in the

case-base. This involves a random retrieval of cases from the case-base for each focus

case. Instead of using the retrieval algorithm a random case is selected and compared

to each focus case. The percentage of correctly classified cases is calculated and

referred to as the random classification accuracy (RCA) of the case-base. It is useful

to calculate this value to ensure that the CBR mechanism is doing more than just

randomly selecting cases. Therefore it is important to ensure that the classification

accuracy of a case-base is significantly higher than its random classification accuracy.
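A sketch of the random baseline is given below. It assumes that the random case is drawn from the cases other than the focus case, which the description above does not state explicitly; the function name and the case representation are also illustrative.

```python
import random


def random_classification_accuracy(case_base, rng):
    """Percentage of focus cases whose repair structure matches a randomly drawn case.

    case_base: list of (violation, repair_structure) pairs.
    rng:       a random.Random instance, passed in so that runs are repeatable.
    """
    if len(case_base) < 2:
        raise ValueError("need at least two cases")
    correct = 0
    for idx, (_, repair_structure) in enumerate(case_base):
        remaining = case_base[:idx] + case_base[idx + 1:]  # exclude the focus case
        _, random_structure = rng.choice(remaining)
        if random_structure == repair_structure:
            correct += 1
    return 100.0 * correct / len(case_base)


# Made-up case-base with two repair structures.
toy_cases = [(0.10, "swap"), (0.12, "swap"), (0.90, "assign"),
             (0.95, "assign"), (0.80, "swap")]
rca = random_classification_accuracy(toy_cases, random.Random(0))
```

Because the result depends on the random draws, it is sensible to average the RCA over several runs before comparing it with the classification accuracy.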

6.3 Violation Features

The first stage of the retrieval phase chooses cases that are structurally the same

as the focus violation. A large number of such cases can exist within a case-base

and therefore it is necessary to rank them according to their violation features. The

violation features are statistical characteristics of the roster and the violation. They

can be seen as a ‘snap-shot’ of the state of the roster at the time the violation was

repaired. They are considered to be important when making rostering decisions but

their exact relationships with each other and with the decisions that are made are

not known a priori. Nearest neighbour similarity measures such as those employed

by case-based reasoning allow these relationships to be captured without the need for

explicit representation [247].

In order to represent sufficient information about a roster and its violations a

number of different types of data are used for the features. Four different types of

features are used here: Real-Valued, Shift Pattern, On-Off Pattern, and Cover Array.

The inclusion of features that are not real or integer valued also necessitates the

definition of alternative difference measures.

6.3.1 Real-Valued Features

The real-valued features are statistical measurements of the roster and violation.

They are normalised to the interval [0,1]. An example is the percentage of nurse

preferences satisfied. The distance measure used for real-valued features is:

f(a, b) = |b− a| (6.1)

where a, b ∈ [0, 1].

6.3.2 Shift and On-Off Pattern Features

A significant amount of the information needed to make rostering decisions is present

in the working patterns of the nurses involved. Shift pattern features are arrays

representing the shifts that an individual nurse works over a specific number of days.

On-off pattern arrays are similar but only represent whether or not an individual

nurse is working any shift on each day. The number of days can vary depending on

the type of constraint violation that is being recorded in the case-base.

The distance between patterns is calculated as the total number of positions in

which the shift is different and is normalised by dividing by the pattern length. Given

a = [a0, a1, . . . , aP−1] and b = [b0, b1, . . . , bP−1],

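\[
f(\mathbf{a}, \mathbf{b}) = \frac{1}{P} \sum_{i=0}^{P-1} \delta(a_i, b_i) \tag{6.2}
\]

\[
\delta(a_i, b_i) =
\begin{cases}
0 & a_i = b_i \\
1 & a_i \neq b_i
\end{cases}
\]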
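The pattern distance is straightforward to compute. The following Python sketch applies it to two illustrative seven-day shift patterns and to the corresponding on-off patterns; the single-letter shift codes and the treatment of U (unassigned) and O (off) as non-working days are assumptions made for the example.

```python
def pattern_distance(a, b):
    """Fraction of positions at which two equal-length patterns differ, as in (6.2)."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("patterns must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def on_off(pattern, not_working=("U", "O")):
    """Reduce a shift pattern to an on-off pattern (1 = working, 0 = not working)."""
    return [0 if shift in not_working else 1 for shift in pattern]


# Two illustrative seven-day shift patterns (E = early, L = late, U = unassigned).
first, second = "ELEELUU", "ELUUELU"

shift_distance = pattern_distance(first, second)                   # 4 of 7 positions differ
on_off_distance = pattern_distance(on_off(first), on_off(second))  # 3 of 7 positions differ
```

The two measures disagree here because day 4 holds different shifts (L and E) that are both working days, which illustrates why the two feature types can rank cases differently.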

It may be unclear why both shift patterns and on-off patterns are defined as

separate feature types. Shift patterns contain more information than on-off patterns

but could potentially cloud the similarity measure if the extra detail is not relevant.

Hence both types are included with the intention that the genetic algorithm will

decide whether either or both types are useful to the retrieval phase.

6.3.3 Cover Array Features

Cover arrays provide a data structure for storing information about the different types

of nurse working a particular shift on a single day, or over a number of days. There

are two types of cover array: ordinary cover arrays and average cover arrays.

An ordinary cover array counts the number of nurses of different qualifications or

specialty training working on a particular shift. The ordinary cover array x is defined

as an array of integers:

x = [[x0,0, x0,1, . . . , x0,S−1], [x1,0, x1,1, . . . , x1,S−1], . . . , [xD−1,0, xD−1,1, . . . , xD−1,S−1]]

where S is the number of different types of nurse and D is the number of days. For

cover arrays measuring the number of nurses based on qualification level S will be

equal to 7 (i.e. XN, SN, PN, AN, QN, EN, RN) and based on skill level S will equal

2 (ET or NT). The number of days that a cover array is measured over depends

on the type of constraint violation for which it is being used. Cover arrays over

several days are useful for storing shift pattern information around the day in which

a violation occurs (for violations such as Cover, Succession, and MinDaysOn). They

store information which indicates the possibility of different repair types that could

take place around the violation.

An average cover array records the average of the number of nurses of each type

over a specified period. It takes the same format as the ordinary cover array, but

with D set to 1 and with real-valued elements.

The distance measure for cover arrays takes into consideration the difference be-

tween each corresponding pair of elements. Given cover arrays a and b the difference

function is defined as follows:

\[
f(\mathbf{a}, \mathbf{b}) = \frac{1}{\ldots} \sum_{d=0}^{D-1} \sum_{s=0}^{S-1} |b_{d,s} - a_{d,s}| \tag{6.5}
\]

When designing features for the nurse rostering problem it became evident that

the definition of similarity of violations must include information about the scope

for repair. For example, if the EARLY shift on a particular day is not sufficiently

covered then information about the number of nurses of each type who are currently

UNASSIGNED, or OFF, indicates how ‘easy’ it will be to solve the violation. The

number of available nurses of each type is very relevant to decision making and should

be captured by the case-base. Cover arrays provide a way in which to record shift

pattern information about a large number of nurses.

This can be illustrated with several examples. Consider the following seven day

roster in Table 6.1.

Table 6.1: An example roster

Nurse    NurseType              0  1  2  3  4  5  6
nurse0   〈RN, M, H, ET, E〉    E  L  E  E  L  U  U
nurse1   〈RN, F, H, NT, D〉    N  N  U  U  E  L  E
nurse2   〈EN, F, H, ET, D〉    U  U  N  N  U  U  E
nurse3   〈AN, F, H, NT, B〉    E  L  U  U  E  L  U

The ordinary cover array that counts the nurses based on their qualification (XN,

SN, PN, AN, QN, EN, or RN) for the EARLY shift on day 4 is:

[[2, 0, 2, 1, 1, 0, 1]] (6.6)

indicating that there are 2 nurses (of any qualification), 0 student nurses, 2 employed

nurses, 1 auxiliary nurse, 1 qualified nurse, 0 enrolled nurses, and 1 registered nurse working the early

shift on day 4.

The cover array for unassigned shifts over the three day period from day 3 to day 5 is:

[[2, 0, 2, 1, 1, 0, 1], [1, 0, 1, 0, 1, 1, 0], [2, 0, 2, 0, 2, 1, 1]] (6.7)

Finally, the average skill mix for the unassigned shift over the seven day period

based on qualification is:

[[1.57, 0.00, 1.57, 0.43, 1.14, 0.57, 0.57]] (6.8)

indicating that there is an average of 1.57 nurses with unassigned shifts over the period, 0 student nurses, 1.57 employed nurses, 0.43 auxiliary nurses, and so on.
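The counting behind these arrays can be sketched in Python using the roster of Table 6.1. The membership table, which states the qualification categories under which each nurse is counted, is inferred from the explanation of (6.6) and is an assumption, as are the function names and the single-letter shift codes; the SN entry in particular is a guess, since no student nurse appears in the example roster.

```python
QUALIFICATIONS = ["XN", "SN", "PN", "AN", "QN", "EN", "RN"]

# Assumed category membership for each base qualification.
MEMBERSHIP = {
    "RN": {"XN", "PN", "QN", "RN"},
    "EN": {"XN", "PN", "QN", "EN"},
    "AN": {"XN", "PN", "AN"},
    "SN": {"XN", "SN"},
}

# Table 6.1: nurse -> (qualification, shifts on days 0..6).
ROSTER = {
    "nurse0": ("RN", "ELEELUU"),
    "nurse1": ("RN", "NNUUELE"),
    "nurse2": ("EN", "UUNNUUE"),
    "nurse3": ("AN", "ELUUELU"),
}


def cover_array(roster, shift, days):
    """Ordinary cover array: one row of qualification counts per day."""
    rows = []
    for day in days:
        row = [0] * len(QUALIFICATIONS)
        for qualification, pattern in roster.values():
            if pattern[day] == shift:
                for k, category in enumerate(QUALIFICATIONS):
                    if category in MEMBERSHIP[qualification]:
                        row[k] += 1
        rows.append(row)
    return rows


def average_cover_array(roster, shift, days):
    """Average cover array: a single row of per-category means over the days."""
    rows = cover_array(roster, shift, days)
    return [[round(sum(column) / len(rows), 2) for column in zip(*rows)]]


early_day_4 = cover_array(ROSTER, "E", [4])  # [[2, 0, 2, 1, 1, 0, 1]], as in (6.6)
```

The same two functions cover the remaining examples by changing the shift code to "U" and the list of days.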

6.3.4 Features

The set of constraint violation features used for each violation type has grown through-

out the development of the CABAROST method. Features have been added and re-

fined through experimentation and contact with the experts at the QMC. One of the

goals of this research is to discover which aspects of the roster are important when

making rostering decisions by selecting the ‘best’ of the large number of violation

features. The genetic algorithm described in this chapter will reduce the number of

features (perhaps significantly) and thus increase both the quality and efficiency of

the CABAROST method.

Table 6.2 describes the features used for five violation types. The first column

contains the feature descriptions, and the second, third, and fourth columns describe

the nurses, shifts, and period (number of days), respectively, over which each feature

is measured. The fifth and sixth columns give the abbreviation for each feature and

the data type used to represent them. The final columns indicate the violations for

which the features are used.

Features can be measured over different combinations of nurses, shifts, and pe-

riods. The second column in Table 6.2 shows which nurses are considered for each

feature. Features that measure over all of the nurses in the roster have ‘All’ in this

column. Features that have ‘NurseType’ in this column consider only those

nurses in the roster with type information that matches the NurseType variable of

the violation. For example, if the violation describes a problem involving an enrolled

nurse then the associated features will measure their statistics for all enrolled nurses

in the roster. When ‘Specific’ appears in this column it indicates that the feature

considers only the specific nurse involved in the violation.

The first six features in the table are used by all of the violation types. They

are general, real-valued statistics measured over the entire rostering period. They

describe the overall state of the roster at the time the violation was repaired. The

next four features are unique to cover violations and describe the number of hours

that have been assigned to nurses on the day that the violation of the cover constraint

took place. The final six real-valued features are used by all of the violation types

except cover violations. They all measure the percentage of contract hours that are

currently allocated within the period of the violation (the number of days over which

the violation takes place) and also outside this period. These features are measured

for all nurses, the specific nurse involved in the violation, and nurses of the same type

as the nurse involved in the violation.

The shift pattern feature is a feature of all of the violations except cover violations.

It records the shift assignments of a particular nurse and is not used for violations

of cover constraints as these do not involve a single identifiable nurse. Similarly, the

on-off pattern feature is not used to describe cover violations or succession violations.

Succession violations always involve a two day period, both of which will always be

assigned as ‘on’ shifts for the nurse, and consequently on-off pattern features are not

useful.

Eight features which use ordinary cover arrays to represent their values are used for

cover violations, four of which count the nurses who have OFF shifts and four which

count UNASSIGNED shifts. They measure cover on both the single day that the cover

violation takes place and for five days around the violation (two days either side),

for nurse qualification and specialty training. MaxDaysOn and MaxHours violations

take place over larger periods (e.g. 14 days) and so average cover arrays are used.

Four of these are defined, counting nurses who have OFF and UNASSIGNED shifts

for both qualification and specialty training. Six average cover arrays are used for

MinDaysOn violations, measuring cover around the period of the violation (i.e. over

the period and one day either side) for EARLY, LATE, and NIGHT shifts. These are

summarised in two lines on the table for brevity. This is also the case for the twelve

ordinary cover array features used by the Succession violation. These are measured

on both days of the violation separately for the OFF shifts and UNASSIGNED shifts

as well as for the specific shift that the nurse had on the day.

A large number of features have been described here. These features are predicted

to have an impact on rostering decisions. The feature selection algorithm aims to

reduce this number and identify the features that are most relevant for the nurse

rostering problem.

Table 6.2: The features used to represent constraint violations in the case-base

6.4 Genetic Algorithm for Feature Weighting and Selection

The nearest neighbour distance function which is used in the retrieval phase requires

a good selection of features and an appropriate set of feature weights. The effect of

an increase in the weight of a particular feature is an increase in the influence that

the feature has on the selection process. By decreasing their weighting, irrelevant

features exert less influence on the calculation of the distance between cases, thus

increasing the accuracy of the system.

It is not always the case that a set of equal-valued feature weights is appropriate

for any given case-base. The definition of a good set of feature weights is not a trivial

problem. Manual selection of weights is a difficult task and can introduce unwanted

bias to the classification process. The automated feature weighting algorithm used

here is an adaptation of the GA-WKNN algorithm proposed by Kelly and Davis [127].

This method uses genetic algorithms to find a good set of feature weights with respect

to the classification accuracy of the case-base.

Feature selection is effectively an extension of feature weighting. Feature weights

set at zero (or close to zero) represent a deselection of the corresponding features.

Feature selection introduces the concept of an optimal subset of features as an ex-

tension of the optimal set of feature weights. One of the major advantages of feature

selection is the reduction in the dimensionality of the feature space. By selecting

only those features that are most relevant to the problem the storage requirements of

the case-base are reduced and the speed with which cases are retrieved is increased.

The feature selection algorithm presented here works alongside the feature weighting

algorithm, in a combination similar to that first suggested by Raymer et al. [196].

Genetic algorithms are optimisation tools based on the concepts of natural evo-

lution. A population of solutions is manipulated through a number of generations

according to the principles of natural selection. The solutions are represented as

chromosomes which can be combined to produce offspring through crossover opera-

tions. Crossover operations use information from two or more parent chromosomes

to generate new chromosomes. Members of the population can be altered between

generations by applying local mutations to chromosomes. Members of a current pop-

ulation are selected for crossover and mutation according to their fitness - a measure

of the quality of the solution that they represent. Members with higher fitness are

more likely to be selected than those with lower fitness and are therefore more likely

to pass good solution information to the next generation. There is a large body of

literature dedicated to the theory and practice of genetic algorithms. Some key texts

are [101, 112, 165].

In this problem chromosomes are vectors of real and binary values. The vector

can be split into four sections. The first section contains the real valued weights for

each of the features. The remaining three sections represent binary feature selection

variables - three for each feature. A voting system is used to determine if a feature is

selected. At least two out of the three selection variables must have value 1 for the

associated weighted feature to be used in the distance function given in Formula 5.9,

otherwise the feature’s weight is set to 0 and the feature is excluded (deselected) from

the distance function. The use of three sets of selection variables effectively smooths

out the fitness landscape by reducing the impact of a change in the selection status of

a feature on a chromosome’s fitness [196]. Given violation feature vectors of dimension

F , a member of a population is represented as:

\[ m = \{ \underbrace{w_0, \ldots, w_{F-1}}_{\text{feature weights}},\; \underbrace{f_{0,0}, \ldots, f_{0,F-1},\; f_{1,0}, \ldots, f_{1,F-1},\; f_{2,0}, \ldots, f_{2,F-1}}_{\text{feature selection sets}} \} \]
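The two-out-of-three vote described above can be sketched in a few lines of Python. This is an illustration only: the function name `decode_selection` and the flat list layout (weights first, then the three selection sets in order) are assumptions made for the example, not code from the CABAROST system.

```python
from typing import List, Sequence


def decode_selection(chromosome: Sequence[float], num_features: int) -> List[float]:
    """Apply the majority vote to a chromosome of length 4 * F.

    Assumed layout (hypothetical, mirroring the representation of m):
    positions 0..F-1 hold the weights, followed by three blocks of F
    binary selection variables.
    """
    F = num_features
    if len(chromosome) != 4 * F:
        raise ValueError("chromosome must have length 4 * num_features")
    weights = [float(w) for w in chromosome[:F]]
    decoded = []
    for i in range(F):
        # Selection variable of set s for feature i sits at F * (s + 1) + i.
        votes = sum(int(chromosome[F * (s + 1) + i]) for s in range(3))
        # At least two of the three variables must be 1 to keep the feature.
        decoded.append(weights[i] if votes >= 2 else 0.0)
    return decoded
```

Because a single flipped selection bit changes the outcome only when the other two bits disagree, most single-bit mutations leave the decoded weight vector unchanged, which is the smoothing effect attributed to the voting scheme.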

One-point crossover [31] was used to combine two parents with a 0.6 probability.

This operation chooses a position on two chromosomes at random and then creates

two children by swapping the parents’ values to the right of the selected position.

Given two parents $x = \{x_0, x_1, \ldots, x_{Z-1}\}$ and $y = \{y_0, y_1, \ldots, y_{Z-1}\}$, and crossover

point $c \in [0, Z-1]$, where $Z = 4 \times F$ is the chromosome length and $F$ is the number of features, the two children

produced by a crossover of $x$ and $y$ are:

\[ x' = \{x_0, \ldots, x_c, y_{c+1}, \ldots, y_{Z-1}\} \quad \text{and} \quad y' = \{y_0, \ldots, y_c, x_{c+1}, \ldots, x_{Z-1}\} \tag{6.10} \]
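As a minimal sketch of the one-point crossover in Formula 6.10, the following Python functions build the two children from two parents and a crossover point. The names `one_point_crossover` and `maybe_crossover` are hypothetical; the 0.6 rate is the crossover probability quoted in the text.

```python
import random
from typing import List, Sequence, Tuple


def one_point_crossover(x: Sequence, y: Sequence, c: int) -> Tuple[List, List]:
    """Swap everything to the right of position c between two parents."""
    if len(x) != len(y):
        raise ValueError("parents must have the same length")
    if not 0 <= c < len(x):
        raise ValueError("crossover point must lie in [0, Z - 1]")
    child_x = list(x[: c + 1]) + list(y[c + 1 :])
    child_y = list(y[: c + 1]) + list(x[c + 1 :])
    return child_x, child_y


def maybe_crossover(x: Sequence, y: Sequence, rate: float = 0.6,
                    rng: random.Random = random) -> Tuple[List, List]:
    """Apply crossover with the given probability, otherwise copy the parents."""
    if rng.random() < rate:
        return one_point_crossover(x, y, rng.randrange(len(x)))
    return list(x), list(y)
```

Note that with c = Z − 1 nothing lies to the right of the crossover point, so the children are copies of their parents.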

Parents were selected for crossover using a roulette wheel selection process [165].

In this method, individuals are selected with a probability equal to the proportion

that their calculated fitness contributes to the sum of the fitness of all members of

the population. Chromosomes are mutated by adding or subtracting a set amount

from one of their weights, or by performing a NOT operation on a feature selection

variable. Children undergo mutation with a 0.7 probability. Finally, an elitist strategy

is used whereby the chromosome with the highest fitness is always passed on, without

mutation, to the next generation.
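The selection and mutation operators described above might look as follows in Python. This is a sketch under stated assumptions: the mutation step of 0.1 and the clamping of weights at zero are illustrative choices (the text only says "a set amount"), and the function names are hypothetical.

```python
import random
from typing import List, Sequence


def roulette_select(population: Sequence, fitnesses: Sequence[float],
                    rng: random.Random = random):
    """Pick one member with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        return rng.choice(list(population))
    pick = rng.random() * total
    running = 0.0
    for member, fitness in zip(population, fitnesses):
        running += fitness
        if pick < running:
            return member
    return population[-1]


def mutate(chromosome: Sequence[float], num_features: int, step: float = 0.1,
           rng: random.Random = random) -> List[float]:
    """Mutate one randomly chosen gene of a 4 * F chromosome."""
    child = list(chromosome)
    pos = rng.randrange(len(child))
    if pos < num_features:
        # Weight gene: add or subtract a set amount (step is an assumed value).
        delta = step if rng.random() < 0.5 else -step
        child[pos] = max(0.0, child[pos] + delta)
    else:
        # Selection gene: logical NOT on a binary variable.
        child[pos] = 1 - child[pos]
    return child
```

Under the elitist strategy, the fittest chromosome would simply be copied into the next generation before any children produced by these operators are added.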

The fitness of an individual chromosome is calculated by first converting it into a

standard weight vector by setting those weights to 0 whose feature selection variables

vote for deselection. The vector is then normalised so that the total sum of weights is

equal to one. This weight vector is then used in the weighted nearest neighbour

function and the classification accuracy of the case-base is calculated.
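A rough sketch of this fitness evaluation is given below. Two parts are assumptions made for illustration and are not taken from the thesis: the distance is a plain weighted sum of absolute differences (a stand-in for Formula 5.9), and accuracy is measured by leave-one-out nearest-neighbour classification (a stand-in for the measure of Section 6.2). The names are hypothetical.

```python
from typing import List, Sequence, Tuple

Case = Tuple[Sequence[float], str]  # (violation feature vector, repair label)


def normalise(weights: Sequence[float]) -> List[float]:
    """Scale the weights so that they sum to one (no-op if all are zero)."""
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else list(weights)


def weighted_distance(a: Sequence[float], b: Sequence[float],
                      weights: Sequence[float]) -> float:
    """Assumed stand-in distance: weighted sum of absolute differences."""
    return sum(w * abs(x - y) for w, x, y in zip(weights, a, b))


def classification_accuracy(cases: Sequence[Case], weights: Sequence[float]) -> float:
    """Leave-one-out accuracy of weighted nearest neighbour retrieval."""
    w = normalise(weights)
    correct = 0
    for i, (features, label) in enumerate(cases):
        others = [c for j, c in enumerate(cases) if j != i]
        nearest = min(others, key=lambda c: weighted_distance(features, c[0], w))
        if nearest[1] == label:
            correct += 1
    return correct / len(cases)
```

A deselected feature enters this calculation with weight 0, so it contributes nothing to the distance and cannot influence which case is retrieved.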

The initial population is filled with randomly generated vectors. The feature

selection variables are set randomly with no bias placed on how many features are

selected. The feature weights of all members of the population vectors are normalised

before the mutation and crossover operators are applied.

The feature set used for each type of violation is different and so the algorithm

must be run separately for each. The result of this is that each different violation type

will have a different set of feature weights and a different subset of selected features.

There are clear advantages to treating violations of different types separately in the

case-base. It is intuitive that different information is needed to make decisions about

different types of problems, and the separation of weighting and feature selection

allows the appropriate emphasis to be placed on the relevant data.

6.5 Results

The algorithm was used to select features and feature weights based on a case-base

trained using the expert rostering knowledge of nurses at the QMC. It was trained

over two months on rosters involving 12 different constraints:

9. MaxDaysOn: The maximum number of consecutive shifts for nurses of any type

10. MaxHours: The maximum number of hours any nurse may work in a fortnight

(14 days) is 75

11. MinDaysOn: The minimum number of consecutive shifts for nurses of any type

12. Succession: An EARLY shift must not follow a NIGHT shift

This case-base contained 237 cases representing different numbers of each violation

type: 97 Cover, 29 MaxDaysOn, 34 MaxHours, 48 MinDaysOn, and 29 Succession.

The comparatively large number of cover violations reflects their prevalence within

the rosters used to train the case-base.

In this section three different approaches to feature selection and weighting are

evaluated by measuring the classification accuracy of the CABAROST method. Fea-

ture selections and weights generated by the genetic algorithm and by a simple local

search algorithm are compared to a flat (all equal) weighting of the nearest neigh-

bour function with all features selected. The local search method applies a simple

greedy heuristic and includes no mechanism for avoiding local optima. It uses two

neighbourhood definitions. The first generates neighbouring vectors by switching the

selection status of each feature in turn. The second chooses a feature at random and

generates neighbouring vectors by setting the weight of the feature to all values between

0 and 1 with a discretisation step of 0.1. At every iteration the best neighbour

is selected from these two neighbourhoods. The search stops when no improvement

is made for a number of iterations which was determined based on the size of the two

neighbourhoods. The local search method is also applied to the GA generated weight

vectors to ensure that these have been locally optimised.

Table 6.3: Classification accuracy (with full initial feature set)

Violation Type   RCA             CB-1    CB-LS          CB-GA          CB-GA+LS
Cover            47.42 (6.98)    70.10   75.72 (3.69)   80.88 (1.48)   80.88 (1.48)
MaxDaysOn        18.97 (5.92)    58.62   73.10 (4.82)   78.28 (2.98)   78.92 (2.64)
MaxHours         18.46 (11.27)   23.53   48.24 (5.68)   52.65 (1.88)   53.15 (1.76)
MinDaysOn        21.88 (8.35)    83.33   95.10 (3.72)   97.60 (1.02)   97.60 (1.02)
Succession       25.07 (8.23)    55.16   85.17 (4.67)   90.52 (2.47)   90.52 (2.47)
OVERALL:         31.88           62.87   76.54          81.08          81.23
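The two neighbourhoods of the local search can be illustrated with the Python sketch below. The representation of a solution as a pair of lists (weights, selection flags) and all function names are assumptions made for this example; including both endpoints 0 and 1 in the weight grid is also an assumption.

```python
import random
from typing import Callable, List, Tuple

Solution = Tuple[List[float], List[bool]]  # (weights, selection flags)


def selection_neighbours(sol: Solution) -> List[Solution]:
    """First neighbourhood: flip the selection status of each feature in turn."""
    weights, selected = sol
    out = []
    for i in range(len(selected)):
        flipped = list(selected)
        flipped[i] = not flipped[i]
        out.append((list(weights), flipped))
    return out


def weight_neighbours(sol: Solution, rng: random.Random = random) -> List[Solution]:
    """Second neighbourhood: one random feature, weights 0.0, 0.1, ..., 1.0."""
    weights, selected = sol
    i = rng.randrange(len(weights))
    out = []
    for k in range(11):
        changed = list(weights)
        changed[i] = k / 10
        out.append((changed, list(selected)))
    return out


def greedy_step(sol: Solution, evaluate: Callable[[Solution], float],
                rng: random.Random = random) -> Solution:
    """Return the best neighbour over both neighbourhoods."""
    candidates = selection_neighbours(sol) + weight_neighbours(sol, rng)
    return max(candidates, key=evaluate)
```

In the experiments `evaluate` would be the classification accuracy of the case-base; here it is left as a parameter so the sketch stays self-contained.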

The genetic algorithm was run 20 times on the case-base, each time for 60 gen-

erations with a population size of 60. Table 6.3 shows the average (and standard

deviation) classification accuracy of the case-base for each type of violation using the

different weighting approaches. The bottom row gives the overall classification accu-

racy for the whole case-base (i.e. taking into account all violation types). The RCA

and CB-1 columns show the random classification accuracy of the case-base (calcu-

lated 20 times) and the classification accuracy using flat weights, respectively. The

results for the local search (CB-LS) and genetic algorithm (CB-GA) are presented

in the fourth and fifth columns. The final column shows the results for the genetic

algorithm followed by the local search (CB-GA+LS).

It is clear from the results that the overall classification accuracy of the CABAROST

method is significantly increased by applying the feature weighting and selection al-

gorithms. The results also verify that the flat weighted nearest neighbour similarity

measure performs significantly better than a random selection strategy. The genetic

algorithm produces consistently better weight vectors than the local search method

due to its ability to escape bad local optima. The weight vectors produced by the

local search method vary more considerably and this is evident in the larger standard

deviations in the results produced. Furthermore, applying local search to the

final solutions produced by the genetic algorithm did not increase performance in the

vast majority of instances. This indicates that the genetic algorithm is capable of

converging to good local optima.

Table 6.4: The average number of selected features (with full initial feature set)

Violation Type   CB-1   CB-LS   CB-GA   CB-GA+LS
Cover            18     3.4     4.1     4.1
MaxDaysOn        18     2.8     2.9     2.9
MaxHours         18     3.6     3.2     3.2
MinDaysOn        20     3.1     3.7     3.7
Succession       25     2.6     3.1     3.1

It should be noted that the classification accuracies for MaxHours constraint vi-

olations are significantly lower than for other constraint types. MaxHours violations

usually occur over larger periods than other violations - typically 14 days. This low

classification accuracy is a reflection of how difficult it is for the nearest neighbour

distance measure to differentiate between different cases when data about so many

days must be considered.

Table 6.4 shows the average number of features selected in the best solutions of

each weighting algorithm. The random classification method does not use the feature

sets and so is not included in the table. The results are shown for the flat weighting

(i.e. the original feature set), the local search method, the genetic algorithm, and the

genetic algorithm followed by local search.

The feature weighting and selection algorithms all found good solutions with sig-

nificantly smaller sets of selected features. The genetic algorithm selected slightly

more features than the local search. This is probably due to the fact that if the local

search deselects features very early on in the search it may be unable to re-select

them without first reducing the overall quality of the weight vector. Its inability to

escape local optima prohibits it from performing this kind of optimisation. When

applied to the final solutions of the genetic algorithm the local search did not change

the number of selected features in any instance. This reinforces the conclusion that

the genetic algorithm is capable of converging to local optima in the search space.

Each run took approximately half an hour on a machine with a 3.0 GHz Intel Pentium 4

processor. This large run time was in part due to the large number of features

used for each type of violation. The most computationally expensive operation of

the algorithm is the calculation of chromosome fitness. One goal of this work is to

reduce the number of features that are stored in cases. This will reduce the runtime

to operationally feasible levels and potentially allow the algorithm to achieve even

better classification accuracy by re-weighting and selecting from the relevant subset.

In order to test this hypothesis a further round of experiments was performed on a

refined subset of the features.

The refined sets of features were determined using the results of the initial exper-

iments. A simple selection heuristic was applied based on two conditions. A feature

was retained in the refined set if one of the following conditions was true:

1. It was selected in at least half of the ten best performing solutions produced by

the genetic algorithm over the 20 runs.

2. It was selected in the best solution produced by the genetic algorithm over the

20 runs.
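The two retention conditions above amount to a small set computation, sketched here in Python. The input format (one pair of accuracy and selected-feature set per genetic algorithm run) and the function name are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Run = Tuple[float, Set[str]]  # (classification accuracy, selected features)


def refined_feature_set(runs: List[Run], top_k: int = 10) -> Set[str]:
    """Keep features chosen in at least half of the top_k runs, or in the best run."""
    ranked = sorted(runs, key=lambda run: run[0], reverse=True)
    top = ranked[:top_k]
    counts: Dict[str, int] = {}
    for _, selected in top:
        for feature in selected:
            counts[feature] = counts.get(feature, 0) + 1
    # Condition 1: selected in at least half of the best-performing solutions.
    frequent = {f for f, n in counts.items() if n >= len(top) / 2}
    # Condition 2: selected in the single best solution.
    return frequent | set(ranked[0][1])
```

With 20 runs and `top_k = 10`, a feature is retained if it appears in five or more of the ten best solutions, or in the best one.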

A new case-base was then created with cases which included only these features in

their violation feature vectors. The feature weighting and selection algorithms were

then applied to the new case-base. The genetic algorithm was then run 20 times for

50 generations and with a population size of 50. Each run of the algorithm took

less than 5 minutes. Table 6.5 shows the classification accuracy of the CABAROST

method using flat weights, local search, the genetic algorithm, and the genetic algo-

rithm followed by the local search. The average number of features selected by each

algorithm is presented in Table 6.6.

The genetic algorithm outperforms the other weighting methods for all of the

violation types. The solutions it produced from the refined set of features are significantly

better than the solutions it produced using the larger set, with greatly reduced

run-times. The classification accuracy of the CABAROST method using flat weights

is also improved for the refined feature set. The local search is again unable to find

solutions of as high a quality as the genetic algorithm although the difference between

them is not as large. Applying the local search after the genetic algorithm led to no

improvement in fitness for any violation type. Although the reduction in the numbers

of features selected was less significant for this experiment it is of interest that only

around half of the features in the refined sets were selected in the best solutions.

Table 6.5: Classification accuracy (with refined initial feature set)

Violation Type   CB-1    CB-LS          CB-GA          CB-GA+LS
Cover            78.35   79.43 (4.20)   81.96 (0.85)   81.96 (0.85)
MaxDaysOn        72.41   78.44 (3.33)   82.06 (1.80)   82.06 (1.80)
MaxHours         50.00   53.09 (1.50)   54.12 (1.48)   54.12 (1.48)
MinDaysOn        91.67   97.81 (0.47)   97.92 (0.00)   97.92 (0.00)
Succession       86.21   89.83 (2.62)   93.10 (0.00)   93.10 (0.00)
OVERALL:         77.22   80.53          82.57          82.57

Table 6.6: Average number of selected features (with refined initial feature set)

Violation Type   CB-1   CB-LS   CB-GA   CB-GA+LS
Cover            5      3.0     3.8     3.8
MaxDaysOn        8      3.3     3.5     3.5
MaxHours         4      2.6     2.2     2.2
MinDaysOn        11     3.0     4.1     4.1
Succession       9      3.4     3.7     3.7

Figures 6.2 to 6.6 show the best set of weights discovered by the genetic algorithm

for each violation type. For each violation type the following observations are made:

Cover. The 20 solutions produced by the genetic algorithm for the large initial fea-

ture set displayed a high degree of homogeneity and consequently the refined

feature set chosen by the feature selection conditions was relatively small. In-

deed the best solutions of the 20 runs on the refined set selected all five features.

The largest weight was given to W-G-Ass, the percentage of assigned hours for

all nurses over the whole roster. This could indicate that the order in which

violations are repaired is being captured by the case-base. Early in the rostering

process, when fewer nurses have assigned shifts, cover violations are more easily

solved using highly qualified nurses. This is not the case later in the process

when less qualified nurses must be assigned due to previous rostering actions.

Information about the qualifications of nurses who have UNASSIGNED shifts

on the day of the violation (D-U-QCA), and around the violation (5-U-QCA)

is also important when repairing cover violations.

Figure 6.2: Feature weights for Cover Violations

Figure 6.3: Feature weights for MaxDaysOn Violations

Figure 6.4: Feature weights for MaxHours Violations

Figure 6.5: Feature weights for MinDaysOn Violations

Figure 6.6: Feature weights for Succession Violations

MaxDaysOn. A further ‘refinement’ of the features was carried out by the genetic

algorithm during the second round of experiments. Four of the features which

were included in the refined set by the selection conditions were deselected in the

best solution produced during the second 20 runs. The most important features

according to the weighting were O-L-Ass, the percentage of hours assigned to

nurses of the same type as the nurse involved in the violation on days outside

the period of the violation, and P-U-SCA, the specialty training cover array

over the period of the violation counting nurses who have the UNASSIGNED

shift. This shows that the level of shift assignment both inside and outside

the violation period is used to make repairs of MaxDaysOn violations. It is

also evident that the specialty training of other nurses during the period of the

violation must be considered. This reflects the fact that swap operations are

frequently used to repair these violations.

MaxHours. Of the four features included in the refined set three were selected by

the second round of experiments. The most important of these was O-N-Ass,

which measures the percentage of assigned hours outside of the period of the

violation. It is difficult to explain this result and the low classification accuracy

for Totals violations indicates that more relevant features must be designed.

MinDaysOn. A large number of features were included in the refined set by the

selection conditions. However, in the best solution of the second round of ex-

periments only two of these were selected. P-N-SPat received the larger weight,

showing that the shift pattern that the nurse has over the violation period is

very important when choosing which repair to use.

Succession. Again, only two features from the refined set were selected in the best

solution over the second 20 runs of the genetic algorithm. The most important

feature describing the Succession violation is the specialty training cover array

measured over four days around the second day of the violation counting nurses

who have the UNASSIGNED shift (4-U-SCA-2). The bad succession described

by the violations in the case-base was NIGHT-EARLY. The weighting shows

that most repairs are made by considering the possibilities for swapping the

EARLY shift to another nurse (i.e. one with an UNASSIGNED shift if possible)

and leaving the NIGHT shift where it is.

6.6 Conclusion

This chapter has described a method for the automated selection and weighting of

features for the similarity measure used by the CABAROST method. A genetic

algorithm is used to find a subset of weighted features by searching for combinations

of features and corresponding feature weights that increase the overall classification

accuracy of the case-base retrieval method. The increase in classification accuracy

improves the quality of the repairs that are generated by the CABAROST method by

ensuring that the repair types, and the subsequent nurses, days, and shifts involved,

are more likely to be appropriate for the constraint violations they are used to repair.

At the same time, the decrease in the number of features used to represent cases in

the case-base reduces the time needed for retrieval.

The results of the test using the genetic algorithm on real-world data have also

provided an insight into the nature of manual nurse rostering. The relative impor-

tance of features present in the roster to the making of rostering decisions has been

determined. This kind of information could be very beneficial to other researchers

who are developing nurse rostering algorithms. Meta-heuristic approaches, in partic-

ular, could benefit from the use of such knowledge in the defining of neighbourhood

structures and evaluation functions. For example, information about the staff qual-

ification levels and training should be considered when performing swap operations

on shifts or shift patterns between nurses.

The CABAROST method is enhanced by the inclusion of automatically deter-

mined feature weights. One of the fundamental characteristics of the method is its

ability to adapt to the operational priorities of different nurse rostering experts. Au-

tomatic weighting and feature selection increases this ability by focusing on the data

that is important from a given case-base of experience.

Part III

Meta-heuristic Hybrids

Chapter 7

Combining CABAROST with Tabu

Search

7.1 Introduction

The CABAROST algorithm, working alone, provides a technique for acquiring knowl-

edge about personnel rostering on a case by case basis and can be used in an interac-

tive mode, helping senior staff to build new rosters based on their previous rostering

decisions. In this chapter, the problem of automatically producing final rosters us-

ing the knowledge in the case-base is addressed. The CABAROST method is used

within a simple iterative algorithm which starts with an initial roster consisting only

of the nurses’ preferences (the preference roster) and ‘searches’ for a solution which

violates as few constraints as possible. To determine the benefits gained by using

CABAROST in this way, it will be compared with algorithms constructed using com-

mon meta-heuristic mechanisms.

Meta-heuristic approaches to rostering problems search through the solution space

by iteratively selecting solutions and exploring their neighbourhoods. The neighbour-

hood is defined by generating a number of new solutions around the current solution.

An objective function is used to choose which solution in the neighbourhood to move

to next. The traditional tabu search approach is a meta-heuristic which keeps a

memory of recently visited solutions which may not be revisited within a certain

time period (or tenure). This diversifying feature helps the search to avoid local

optima by forcing it to explore new areas of the search space.

In this chapter we will describe seven different variants of a simple meta-heuristic

framework. These algorithms will not define the neighbourhood around a solution

in the same way as ‘classical’ meta-heuristics. Instead of applying an operator to

an entire solution, these algorithms will choose a violation within a solution and

attempt to repair it. The goal of the search is to find a feasible solution, or at

least to minimise the number of constraint violations in the roster. In addition, the

nurses’ shift preferences should be satisfied wherever possible. The hard constraints

defined within the set C are used to define roster feasibility and the nurse preference

information, which is usually modelled as one or more soft constraints, is used as a

measure of roster quality. The difference with these algorithms is that we do not

represent levels of nurse satisfaction quantitatively during the search, or indeed when

generating repairs.

This chapter is organised as follows. Section 7.2 will describe the different meta-

heuristic variants incorporating CABAROST repair generation with tabu search mech-

anisms. The solutions produced by these algorithms will be compared in Section 7.3.

7.2 Algorithm Variants

A number of different mechanisms are available that can help such a local search

algorithm find good quality, feasible solutions. The seven algorithms described here

are composed of different combinations of these mechanisms. The mechanisms are:

Case-based repair generation: Repairs for constraint violations are generated us-

ing the expert knowledge in the case-base. In particular, whilst addressing the

hard constraint violations, these repairs also imitate the expert’s handling of

the nurses’ shift preferences.

Tabu Lists: Repairs are not repeated within a certain number of iterations by plac-

ing them on a tabu list of forbidden repairs. This reduces the chance that the

search will get trapped in a ‘loop’ of repeating violations and repairs.

Objective function: The search is guided by an objective function which counts the

number of violated constraints in the roster and must be minimised. Repairs

are chosen based on their ability to reduce the total number of violations in the

roster.

The motivation for defining these algorithms is to determine what effect each

mechanism has on search quality. We shall use them to show that knowledge from

a case-base can be successfully combined with traditional meta-heuristic search con-

cepts thus reducing the knowledge acquisition overhead required to model problem

domains. All of the algorithms search for new solutions by iteratively repairing con-

straint violations. In each instance the initial roster consists solely of the nurses’

individual shift preferences. This roster violates a large number of constraints and

the goal of the algorithms presented here is to repair all of these violations. It must

be emphasised that the algorithms without an objective function have no explicit rep-

resentation of this goal. In these algorithms the burden is placed on the mechanism

which generates the repairs to guide the search in the ‘correct’ direction.

The seven algorithms are described below. The first three algorithms generate

repairs randomly and utilise the tabu lists (R-TL), the objective function (R-OBJ),

and both mechanisms combined (R-OBJ-TL). The last four methods use the case-

base retrieval and adaptation methods described in the previous section to generate

repairs - the ‘pure’ CABAROST repair generation (CB), with the tabu lists (CB-TL),

with the objective function (CB-OBJ), and the final algorithm (CB-OBJ-TL) uses all

three mechanisms. All of the algorithms are based on the following iterative repair

structure:

1. While (true) Do

2. Generate PRv by applying the constraints in C to N

3. If |PRv | = 0 Then exit

4. Pick random element violationα ∈ PRv

5. Perform repair generation for violationα and apply repair to N

6. Repeat

Each algorithm described in the remainder of this section implements a different

version of Step 5. Steps 2 and 3 generate the set of constraint violations in the roster

at each iteration and Step 4 chooses a violation to be repaired. After a repair has

been generated and applied to the roster the entire process is repeated.
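The iterative repair structure can be sketched in Python as a small driver loop. The callables `find_violations` and `generate_repair` and the `max_iterations` safeguard are hypothetical additions for this example; the structure in the text loops until no violations remain.

```python
import random
from typing import Any, Callable, List


def iterative_repair(roster: Any,
                     find_violations: Callable[[Any], List[Any]],
                     generate_repair: Callable[[Any, Any], Callable[[Any], Any]],
                     max_iterations: int = 1000,
                     rng: random.Random = random) -> Any:
    """Repeatedly pick a random violation and apply a repair for it."""
    for _ in range(max_iterations):              # Step 1 (bounded here)
        violations = find_violations(roster)     # Step 2
        if not violations:                       # Step 3
            return roster
        violation = rng.choice(violations)       # Step 4
        repair = generate_repair(roster, violation)  # Step 5: generate ...
        roster = repair(roster)                  # ... and apply the repair
    return roster
```

Each of the seven algorithm variants would plug a different `generate_repair` into this loop while leaving the remaining steps unchanged.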

7.2.1 Random Repair Generation with Tabu List (R-TL)

This algorithm uses no problem solving knowledge to generate repairs but uses the

idea of tabu search proposed by Glover in [100]. A tabu list of repairs is used to

help the algorithm to avoid local optima in the number of constraint violations and

a tenure is specified which sets the length of the list (and therefore the number of

iterations for which a stored repair will be considered ‘tabu’). This tenure must be

enforced whenever a repair is added to the tabu list by removing the oldest repair.

Some help is given to the algorithm by ensuring that the parameters of the violations,

including the nurses, days, and shifts involved, are also included as parameters of the

repairs. Otherwise the choice of repair type (Reassign, Swap, or Switch) and the

other parameters involved is entirely random. No evaluation of the quality of repairs

or the degree of violation of the roster is used when deciding on repairs.

Given roster R = 〈N,C〉 and tabu list T = ∅ with tenure t :

5.1. Randomly create repairβ ∈ PRr using the parameters of violationα

as appropriate

5.2. If repairβ ∈ T Then goto 5.1

5.3. Apply repairβ to N

5.4. Add repairβ to T and update T w.r.t. t
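A fixed-tenure tabu list maps naturally onto a bounded queue. The sketch below, which assumes repairs are hashable values drawn from a list of candidates (a simplification of random repair creation) and uses hypothetical names, covers Steps 5.1, 5.2 and 5.4; applying the repair (Step 5.3) is left to the caller.

```python
import random
from collections import deque
from typing import Deque, Hashable, Optional, Sequence


def random_repair_with_tabu(candidates: Sequence[Hashable],
                            tabu: Deque[Hashable],
                            rng: random.Random = random,
                            max_tries: int = 1000) -> Optional[Hashable]:
    """Draw random repairs until one is found that is not on the tabu list."""
    for _ in range(max_tries):
        repair = rng.choice(candidates)   # 5.1: random repair
        if repair in tabu:                # 5.2: reject tabu repairs
            continue
        # 5.4: a deque created with maxlen=t drops its oldest entry
        # automatically, which enforces the tenure t.
        tabu.append(repair)
        return repair
    return None  # give up if every draw was tabu (assumed safeguard)
```

Usage: create the list once with `tabu = deque(maxlen=t)` and pass the same object on every iteration.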

7.2.2 Random Repair Generation with Objective Function (R-OBJ)

This algorithm generates a set of repairs for a given violation and chooses the repair

that will cause the largest decrease (or smallest increase) in the number of constraint

violations in the roster. The algorithm is a basic local search [165] and includes

no mechanism for avoiding local optima. The method for randomly generating the

repairs is the same as used for R-TL. Given roster R and repairβ, the objective

function fR(repairβ) is defined:

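\[ f_R(\mathit{repair}_\beta) = \left( |P^R_v| \text{ before applying } \mathit{repair}_\beta \right) - \left( |P^R_v| \text{ after applying } \mathit{repair}_\beta \right) \tag{7.1} \]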

Given roster R = 〈N,C〉 and objective function fR:

5.1. Randomly generate the set of all possible repairs from PRr using the

parameters of violationα

5.2. Choose the element repairβ from the set of repairs with highest

fR(repairβ)
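A minimal Python sketch of this selection rule follows. Repairs are modelled as functions that return a new roster, and `count_violations` is a caller-supplied function; both are assumptions for the example rather than the thesis implementation.

```python
from typing import Any, Callable, Sequence

Repair = Callable[[Any], Any]  # maps a roster to a repaired copy


def objective(count_violations: Callable[[Any], int], roster: Any,
              repair: Repair) -> int:
    """Violations before the repair minus violations after it."""
    before = count_violations(roster)
    after = count_violations(repair(roster))
    return before - after


def choose_best_repair(count_violations: Callable[[Any], int], roster: Any,
                       repairs: Sequence[Repair]) -> Repair:
    """Step 5.2: pick the repair with the highest objective value."""
    return max(repairs, key=lambda r: objective(count_violations, roster, r))
```

A positive objective value means the repair reduces the number of violations; a negative value means it introduces more violations than it removes.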

7.2.3 Random Repair Generation with Tabu List and Objective Function (R-OBJ-TL)

The tabu list and objective function mechanisms are combined in this algorithm. It

is essentially a tabu search algorithm for the constraint satisfaction problem which

operates on specific constraint violations in the roster.

Given roster R = 〈N,C〉, objective function fR, and tabu list T = ∅ with tenure t:

5.1. Randomly generate the set of all possible repairs from PRr using the

parameters of violationα

5.2. Choose the element repairβ from the set of repairs with highest

fR(repairβ)

5.3. If repairβ ∈ T Then remove repairβ from the set of repairs and goto 5.2

5.5. Add repairβ to T and update T w.r.t. t
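Combining the two mechanisms amounts to taking the best-scoring repair that is not tabu. The following Python sketch assumes hashable repairs and a caller-supplied `score` function standing in for fR; the names are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Hashable, Optional, Sequence


def best_non_tabu_repair(repairs: Sequence[Hashable],
                         score: Callable[[Hashable], float],
                         tabu: Deque[Hashable]) -> Optional[Hashable]:
    """Return the highest-scoring repair that is not on the tabu list."""
    # Walking the repairs in descending score order is equivalent to
    # repeatedly choosing the best repair and discarding it when tabu.
    for repair in sorted(repairs, key=score, reverse=True):
        if repair not in tabu:
            tabu.append(repair)  # bounded deque enforces the tenure
            return repair
    return None  # every candidate is tabu (behaviour assumed here)
```

As in the previous variant, the tabu list would be created once as `deque(maxlen=t)` and shared across iterations.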

7.2.4 Case-Based Repair Generation (CB)

Here the experience stored in the case-base is used to generate repairs. It is assumed,

for the experiments that follow, that the case-base has been well trained and contains

sufficient examples of a variety of different problem solving episodes. There is no

objective function used to choose repairs - the similarity to the retrieved repairs from

the case-base drives the search. The most similar repair from the most similar case

is used at every iteration. There is no method for diversification of the search in this

algorithm. This algorithm is performing a ‘blind’ search for feasibility relying on the

quality of repairs stored in the case-base.

Given roster R = 〈N,C〉 :

5.1. Generate repairβ ∈ PRr using the case-based retrieval and adaptation

methods

7.2.5 Case-Based Repair Generation with Tabu List (CB-TL)

The R-TL algorithm described in Section 7.2.1 is not guided by any rostering knowl-

edge as the repairs for each violation are randomly generated. CABAROST guides

the search using the knowledge in the case-base but is unable to cope when violations

are repeatedly created - it will create the same repair for the violation each time

it is encountered. The diversification provided by the tabu lists and the rostering

knowledge stored in the case-base are combined in the CB-TL algorithm. The tabu

lists can store either repairs, cases, or a combination of both. If the nearest case

found in the case-base is currently on the tabu list then the next nearest case will

be retrieved. Similarly, if a repair generated is on the tabu list then the next nearest

repair generated from the retrieved case is used.

Given roster R = 〈N, C〉 and tabu lists TRepair = ∅ and TCase = ∅ with tenures

tr and tc respectively:

5.1. Retrieve the most similar case case0 ∈ CB

5.2. If case0 ∈ TCase Then discard case0 and goto 5.1

5.3. Generate repairβ ∈ PRr from case0 such that repairβ /∈ TRepair

5.4. Add case0 to TCase and repairβ to TRepair and update tabu lists

w.r.t. tr and tc
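The interplay of the two tabu lists can be sketched as follows. The sketch assumes hashable cases and repairs, a `case_distance` function giving the distance of each case to the current violation, and a `generate_repairs` function returning candidate repairs nearest first; all of these names are hypothetical, and returning `None` when nothing is available is an assumed fallback.

```python
from collections import deque
from typing import Callable, Deque, Hashable, List, Optional, Sequence


def cb_tl_step(case_base: Sequence[Hashable],
               case_distance: Callable[[Hashable], float],
               generate_repairs: Callable[[Hashable], List[Hashable]],
               tabu_cases: Deque[Hashable],
               tabu_repairs: Deque[Hashable]) -> Optional[Hashable]:
    """One CB-TL repair generation step with case and repair tabu lists."""
    # 5.1 / 5.2: nearest case that is not currently tabu.
    case = next((c for c in sorted(case_base, key=case_distance)
                 if c not in tabu_cases), None)
    if case is None:
        return None
    # 5.3: nearest repair from that case that is not currently tabu.
    for repair in generate_repairs(case):
        if repair not in tabu_repairs:
            # 5.4: bounded deques enforce the tenures tc and tr.
            tabu_cases.append(case)
            tabu_repairs.append(repair)
            return repair
    return None
```

Setting the case tenure too high forces retrieval further down the similarity ranking, which is one way to see why a long case tabu list can lead to inappropriate repairs.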

7.2.6 Case-Based Repair Generation with Objective Function (CB-OBJ)

This algorithm combines the explicit representation of the search goal using the ob-

jective function with the case-base repair generation method. Each repair generated

is scored according to a combination of its similarity to the repair from the retrieved

case and the reduction in the number of hard constraint violations it causes in the roster.

Given the objective function fR (Formula 7.1), candidate repair repairβ, retrieved

repair r0 = 〈rStructure0, r0〉, and roster R, the function Score is defined

\[ \mathrm{Score}(\mathit{repair}_\beta) = a \times \frac{1}{d_r(r_\beta, r_0)} + b \times f_R(\mathit{repair}_\beta), \tag{7.2} \]

where rβ is the feature information generalised from repairβ (see Section 5.2). The

summation weights a and b have been found through experimentation to work well

when they are equal - although this may not be the case for all problems.

Given roster R = 〈N,C〉 and score function weights a and b:

5.1. Retrieve the most similar case case0 = (v0, r0)

5.2. Generate a set of repairs based on r0 from case0

5.3. Choose the repairβ with the highest value according to Score(repairβ)
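The score of Formula 7.2 is simple to compute once the repair distance and the objective value are known. In this sketch the candidates are given as tuples of (name, distance to the retrieved repair, objective value); that format, the function names, and the small `eps` guard against a zero distance are assumptions made for illustration.

```python
from typing import List, Tuple

Candidate = Tuple[str, float, float]  # (name, d_r to retrieved repair, f_R value)


def score(distance: float, violations_removed: float,
          a: float = 1.0, b: float = 1.0, eps: float = 1e-9) -> float:
    """Score = a * 1 / d_r + b * f_R, with equal weights by default."""
    # eps avoids division by zero for an exact match (assumed handling).
    return a * (1.0 / max(distance, eps)) + b * violations_removed


def choose_by_score(candidates: List[Candidate],
                    a: float = 1.0, b: float = 1.0) -> str:
    """Step 5.3: return the name of the candidate with the highest score."""
    best = max(candidates, key=lambda c: score(c[1], c[2], a, b))
    return best[0]
```

The reciprocal means that repairs very close to the retrieved repair can dominate the score, so in practice the balance between a and b depends on the scale of the distance measure.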

7.2.7 Case-Based Repair Generation with Tabu List and Objective Function (CB-OBJ-TL)

This final algorithm combines all of the mechanisms. It can be described as a tabu

search for the constraint satisfaction problem with neighbourhoods determined by

the case-base repair generation method for each constraint violation (i.e. at each

iteration).

Given roster R = 〈N, C〉 and tabu lists TRepair = ∅ and TCase = ∅ with tenures

tr and tc respectively:

5.1. Retrieve the most similar case case0 = (v0, r0)

5.2. If case0 ∈ TCase Then discard case0 and goto 5.1

5.3. Generate a set of repairs based on r0 from case0

5.4. Choose the repairβ with the highest value according to Score(repairβ)

5.5. If repairβ ∈ TRepair Then remove repairβ from the set of repairs

and goto 5.4

5.6. Add case0 to TCase and repairβ to TRepair and update tabu lists

w.r.t. tr and tc

7.3 Comparison of Algorithms

The algorithms were tested on real world data from the QMC using eleven different

variants:

• Case-based repair generation (CB);

• Case-based repair generation with tabu lists of cases with tenure 5 (CB-TL-

C5), of repairs with tenure 10 (CB-TL-R10), and with both tabu lists (CB-TL-

C5R10);

• Case-based repair generation with objective function (CB-OBJ);

• Case-based repair generation with objective function and tabu lists of cases (CB-OBJ-TL-C5), repairs (CB-OBJ-TL-R10), and both cases and repairs (CB-OBJ-TL-C5R10);

• Random repair generation with tabu list of repairs with tenure 10 (R-TL-R10);

• Random repair generation with objective function (R-OBJ);

• Random repair generation with objective function and tabu list of repairs with tenure 10 (R-OBJ-TL-R10).

The tenures for the tabu lists were chosen during preliminary experimentation and

they showed good performance over a number of problems. It was noticed that an

increase in repair tenure causes little change in algorithm performance. However, case

tenure is very sensitive and setting this value too high can decrease the performance

of the case-based repair generation significantly. Some of the cases in the case-base

are used with higher frequency due to a higher than average occurrence of the vio-

lation that they represent in the roster. Consequently, when case tenure is increased

it is more likely that the case retrieved from the case-base for a given violation is

insufficiently similar, which could lead to the generation of an inappropriate repair.

A case-base was trained using 300 examples of violations and repairs derived from

preference and final rosters acquired from the QMC. This case-base was an expanded

version of the case-base used to do the experiments described in Chapters 5 and 6 -

examples of violations of the new constraint types (see Section 4.6) were provided as

well as some additional examples of Cover and MaxHours violations. The types of

violations represented in the case-base were not evenly spread, with cover violations

making up the majority of cases. This reflects both the proportion of violations found

in the rosters and the variety of repairs used for cover violations involving different

types of nurses and shifts. The case-base was trained according to the guidelines set

out in Section 5.7.

The algorithms were run on two test problems in which preference rosters from

the four week periods in March and April 2001 were used as initial solutions. Each

algorithm was run 10 times on each problem with a maximum of 500 iterations. The

solution with the least number of constraint violations found in each run was kept and

the results summarised in Table 7.1. The first column of results shows the mean (and

standard deviation in brackets) of the number of constraint violations in the solutions

found over each of the 10 runs of each algorithm. A value of 0 indicates that feasible

solutions were found on every run. The total number of feasible solutions found by

each algorithm (out of 10) is shown in the second column. The mean of the number

of iterations needed to get the solution and the speed (average number of violations

solved per iteration) are shown in the third and fourth columns. In the fifth column

the time taken (in seconds) to find the solution is given. The final column contains

the percentage of nurse shift preferences satisfied.

It is clear from the results in Table 7.1 that all of the algorithms that employed

case-based repair generation were able to find solutions with fewer constraint vio-

lations than those that used randomly generated repairs. Figure 7.1 compares the

mean, minimum, and maximum number of constraint violations in the solutions found

by each algorithm. The CB algorithms are also able to reach these better solutions

much faster than their random counterparts in terms of the number of iterations, and

in comparable or better time in seconds. In fact, prolonging the run times of the

random algorithms did not help them find significantly better solutions.

The nurse preference satisfaction results shown in Figure 7.2 also show that the

CB algorithms outperform their random counterparts. It should be emphasised that

the random algorithms were not given any information about nurse preference and so

they are expected to perform badly. However, these experiments do show that repairs

generated using the nurse preference information implicitly stored in the case-base

are able to guide the search in the direction of better solutions.

Figure 7.1: Mean, maximum, and minimum number of constraint violations

Tabu lists should help the algorithms to avoid local optima by reducing the occurrence of repeating loops of repairs and violations. Indeed, the experiments show that the CB algorithms without tabu lists are more likely to get ‘stuck’ in such loops. The improvements in the number of violations in the best solutions obtained by the CB algorithms with tabu lists for both problems are statistically significant at the 0.01 level¹. However, the random results do not show this behaviour. In fact the R-OBJ-TL-R10 algorithm performs slightly worse than R-OBJ, though this difference is not significant. The lack of improvement is possibly due to the fact that the neighbourhoods defined by the random repair generation method are too large for such a small tabu tenure. The tabu lists are then unable to help move the search away from local optima. The average progress of the CB, CB-TL-C5R10, R-OBJ, and R-OBJ-TL-R10 algorithms for the MARCH 2001 problem can be seen in Figure 7.3.
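To make the significance claim concrete, the pooled two-sample t statistic can be recomputed from the summary values in Table 7.1. The sketch below uses the MARCH 2001 rows for CB (mean 1.4, standard deviation 1.07) and CB-TL-C5 (mean 0.0, standard deviation 0.00), each over 10 runs. It assumes that the bracketed values are sample standard deviations, which the text does not state, and the critical value 2.878 (two-tailed, 0.01 level, 18 degrees of freedom) is taken from a standard t-table rather than from the thesis.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic assuming equal population variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_err, df


# MARCH 2001: CB versus CB-TL-C5, values read from Table 7.1.
t, df = pooled_t(1.4, 1.07, 10, 0.0, 0.00, 10)

T_CRIT_01_DF18 = 2.878  # standard t-table value (an external assumption)
significant = abs(t) > T_CRIT_01_DF18
```

Under these assumptions the statistic is roughly 4.14 with 18 degrees of freedom, which exceeds the critical value and is consistent with the reported significance.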

These experiments do not show clearly which type of tabu list works best for the

CB algorithms. The TL-C5, TL-R10, and TL-C5R10 variants of the CB and CB-OBJ

algorithms show similar improvements. It is almost certainly true that the choice of

tabu list is dependent on the problem being solved and on the content of the case-

base. We suggest that the combined tabu list (TL-C5R10) would be the best option

for most problems because it incorporates both types of diversification.

¹ Using a two-sample t-test for statistical significance, assuming equal population variance.

Figure 7.2: Mean, maximum, and minimum nurse satisfaction

Nurse preference satisfaction is not adversely affected by the use of tabu lists with

the CB algorithms (see Figure 7.4). This issue should be approached with caution

however. It is conceivable that keeping repairs and cases on tabu lists could increase

the probability that new repairs violate nurse preference. In particular, the use of

tabu cases may force the case-base to retrieve less similar cases and therefore generate

less suitable repairs. This can be avoided by including sufficient cases in the case-base

and by keeping the tenure of the case tabu list relatively low.

Both the CB and random algorithms were significantly improved by the use of an objective function (at the 0.05 level). However, the improvement to the random algorithms by using an objective function was not as great as that gained by using CABAROST alone. For the CB algorithms, the objective function introduced a trade-off between repair similarity and repair quality which guided the search towards feasibility in fewer iterations. The scoring function ensures that when very similar repairs are generated from the case-base they will be used, but when it is not possible to generate similar repairs the objective function has a larger influence. Figure 7.5 shows the average progress of the CB, CB-OBJ, R-TL-R10, and R-OBJ-TL-R10 algorithms. A statistically significant improvement in nurse satisfaction can also be seen in the experimental results. This can be explained by the increased speed with which the OBJ algorithms arrived at their best solutions (see Figure 7.6). In general, faster convergence to feasible solutions reduces the chance of unnecessary damage to the nurse preferences.

Figure 7.3: Effects of tabu list on average number of constraint violations for the MARCH 2001 problem

Figure 7.4: Effects of tabu list on average nurse preference satisfaction for the MARCH 2001 problem

Figure 7.5: Effects of objective function on average number of constraint violations for the APRIL 2001 problem

Figure 7.6: Effects of objective function on average nurse preference satisfaction for the APRIL 2001 problem

Figures 7.7 and 7.8 show the final roster produced manually for the month of April

and a roster produced by the CB-OBJ-TL-R10 variant respectively. The rosters

indicate when a request was violated with a shaded background. Where ‘AL’ is

given as a shift the nurse had pre-arranged annual leave and changing these shift

assignments is not possible.

It is immediately evident that the CBR-Tabu hybrid generated roster violates

fewer nurse requests than the manual roster. This could indicate that when training

the case-base the expert paid more attention to satisfying shift requests - or it could

indicate that the automated algorithm found the final roster in fewer iterations thus

causing less damage to the preferences. In a number of places the CBR-Tabu roster

has identical or similar shift assignments to the manually produced roster (in fact the

rosters are identical in 328 out of 490 assignments, i.e. 66.9%, if annual leave is not

counted). An example of similar shift assignments can be found in the final week of

Malinka’s roster - where in the manual roster the assignments are EELUULE and in

the CBR-Tabu roster they are LEEUUEL. This reflects the fact that the expert treats

assignments of Early and Late shifts in a similar fashion - probably concentrating on

satisfying the cover requirements rather than worrying about the order in which shifts

occur. There are some differences in the assignment patterns for Night shifts - this can

be attributed to the lack of additional constraint types (e.g. constraints which do not

allow single night shifts to be rostered) which are taken into account by the method

given in Chapter 8.

Overall, these results show very clearly the benefits of a hybrid approach. CABAROST guides the search towards solutions with very few constraint violations and high nurse preference satisfaction but is unable to reliably find feasible solutions. The addition of tabu lists helps the search out of these ‘good quality’ local optima and increases the chance of finding feasible solutions. The objective function increases the rate of convergence by placing more weight on those repairs that cause a decrease in the number of constraint violations whilst still taking advantage of the knowledge in the case-base.

Figure 7.7: A manually produced roster

Figure 7.8: A roster produced using CB-OBJ-TL-R10

The results presented here are not intended to show that the case-based approach

can out-perform methods that employ a tabu search strategy per se. The random

methods shown here are by no means tailored to the problem - any sensible implemen-

tation would at least include explicit rules for avoiding violation of nurse preference

under specific conditions and would probably choose repairs by minimising the num-

ber of new violations created. However, the results do show that when combined

with a simple meta-heuristic algorithm, case-based repair generation can increase

performance significantly.

7.4 Conclusion

In this chapter we demonstrate that a case-base can capture expert rostering knowl-

edge by storing examples of constraint violations and their corresponding repairs.

This knowledge can then be used to generate repairs to constraint violations in new

problems. Our experiments show that a simple iterative algorithm employing only

the case-based repair generation mechanism can successfully find good quality (albeit

usually sub-optimal) solutions. We have shown that the case-based repair generation

methodology not only imitates repairs which lead to feasible or near-feasible rosters,

but that the repairs are also good at preserving the nurses’ shift preference requests

in the final roster.

The benefits of such a case-based guided search approach are threefold. The

expert-quality repair examples stored in the case-base help the search find feasibility

much faster because they guide the search in sensible directions. They help to guide

the search away from areas of the search space containing ‘bad’ local optima (solutions

which are locally optimal but contain many constraint violations). The repairs stored

in the case-base avoid violating nurse shift preferences wherever possible and so guide

the search towards feasible solutions with high nurse satisfaction. Finally, all of this

information is stored implicitly in the case-base and therefore does not need to be

hard-coded into an algorithm using explicit rostering rules.

We have shown that the quality of rosters produced using the case-based repair

methodology alone is better than that of rosters produced using two standard meta-heuristic

mechanisms, namely an objective function and tabu lists. A hybrid approach incor-

porating the case-based repair generation with tabu lists and an objective function

quickly produces good quality solutions with high levels of nurse shift preference

satisfaction.

It is clear from the results in this chapter that the kind of knowledge captured within

CABAROST reduces the burden of knowledge representation within the objective

function. Most single-objective meta-heuristic methods must represent preferential

treatment of constraint violations using a system of constraint weights, which can

be difficult to discern automatically or from domain experts. By demanding that

experts supply rostering knowledge by example, the CBR based method reduces the

risk, present in methods which employ weighting, of misrepresentation and misunder-

standing.

Algorithm          |PRv|          #Feas   #Its    Spd     Secs    NSat (%)

MARCH 2001
CB                 1.4 (1.07)     2       195.4   0.402   17.8    88.5 (0.90)
CB-TL-C5           0.0 (0.00)     10      205.5   0.410   18.7    87.2 (1.81)
CB-TL-R10          0.7 (0.67)     4       181.1   0.440   16.5    88.7 (1.00)
CB-TL-C5R10        0.2 (0.42)     8       189.0   0.442   17.2    88.5 (1.81)
CB-OBJ             0.3 (0.48)     7       119.4   0.681   16.8    90.3 (1.26)
CB-OBJ-TL-C5       0.2 (0.42)     8       117.7   0.676   16.6    90.4 (1.34)
CB-OBJ-TL-R10      0.1 (0.32)     9       110.4   0.720   15.6    90.1 (2.19)
CB-OBJ-TL-C5R10    0.1 (0.32)     9       117.6   0.680   16.6    90.3 (1.28)
R-TL-R10           43.2 (5.83)    0       435.4   0.084   17.6    61.2 (5.74)
R-OBJ              19.6 (0.84)    0       220.9   0.292   71.5    44.8 (1.90)
R-OBJ-TL-R10       20.0 (1.49)    0       218.6   0.315   70.8    46.4 (2.41)

APRIL 2001
CB                 1.0 (0.94)     3       151.0   0.591   13.8    88.1 (1.74)
CB-TL-C5           0.3 (0.48)     7       153.7   0.587   14.0    87.6 (2.28)
CB-TL-R10          0.0 (0.00)     10      142.2   0.629   13.0    88.0 (1.57)
CB-TL-C5R10        0.0 (0.00)     10      156.2   0.581   14.2    87.3 (0.91)
CB-OBJ             0.1 (0.32)     9       114.5   0.812   16.2    88.7 (1.80)
CB-OBJ-TL-C5       0.1 (0.32)     9       106.1   0.850   15.0    90.6 (1.95)
CB-OBJ-TL-R10      0.0 (0.00)     10      104.5   0.861   14.7    90.7 (2.36)
CB-OBJ-TL-C5R10    0.1 (0.32)     9       105.7   0.850   14.9    90.4 (1.82)
R-TL-R10           37.9 (8.88)    0       436.0   0.118   17.6    59.2 (4.52)
R-OBJ              18.9 (0.88)    0       265.4   0.326   85.9    49.9 (2.10)
R-OBJ-TL-R10       19.5 (0.97)    0       314.4   0.244   101.8   48.1 (1.76)

Table 7.1: Algorithm Performance.

Chapter 8

A Memetic Algorithm for Determining

Repair Orderings

8.1 Introduction

The CABAROST system generates repairs for violations that appear in rosters. It

is possible to repeatedly attempt to solve every violation in a roster in an iterative

fashion, using techniques such as those proposed in Chapter 7. However, the order in

which repairs are applied to the roster has a great effect on the quality of the final so-

lution. In this chapter the problem of finding good quality sequences of CABAROST

generated repairs is addressed. A memetic algorithm, which is a hybridisation of a

genetic algorithm and CBR, was developed to take sequences of repairs produced by

CABAROST and attempt to evolve good quality orderings.

Memetic algorithms [195] combine the concept of biological evolution used by ge-

netic algorithms with the optimisation powers of local search algorithms. There are

a number of different ways to incorporate local search algorithms into the genetic

algorithm. Usually the members of a population are locally optimised between each

generation. Consequently the genetic operators act only on the locally optimal indi-

viduals generated by the local search thus theoretically reducing the overall size of

the search space [59]. Here, the local search used by traditional memetic algorithms

is replaced by the CABAROST repair generation. This is applied iteratively to repair

a small number of violations each time it is invoked. Consequently, the size of the

repair sequences produced by the algorithm increases with each generation as more

and more constraint violations are addressed.

The problem of learning from failure is also addressed in this chapter. In the

context of repair sequences failure is defined as the reappearance of a violation that

has already been repaired. When this happens the case from which the repair was

generated has to be penalised to reduce the possibility that it will be selected in future.

In order to accomplish this, a case weighting strategy that would enable the system

to avoid repetition of the same mistake is proposed. This strategy complements

the optimisation characteristics of the memetic algorithm, and improves the quality

of the rosters produced over a series of experiments.

The combination of CABAROST with a genetic algorithm allows on-going training

to be carried out to improve the quality of the case-base. Case acceptance thresholds

can be defined which detect when users might need to verify or correct certain repair

actions. In this chapter the use of the memetic hybrid in an interactive fashion is

explored.

The chapter is organised as follows. In Section 8.2 the memetic algorithm is

introduced. Section 8.3 describes the strategy for weighting failed cases. In Section

8.4 some experiments on the QMC data are described and the results are discussed.

8.2 Memetic Algorithm

The memetic algorithm that was developed evolves populations of A chromosomes

which consist of variable length sequences of CABAROST generated repairs. Each

repair sequence $s_a$ has the following form:

\[ s_a = \mathit{repair}_{a,0}, \mathit{repair}_{a,1}, \ldots, \mathit{repair}_{a,B_a-1}, \qquad 0 \le a < A \tag{8.1} \]

where $B_a$ is the length of sequence $s_a$.

Initially a population of short repair sequences is created. Each sequence is gen-

erated from the initial preference roster, from which violations are chosen at random

and then repaired using CABAROST. The length of the sequence is also randomly

chosen from an interval which is a parameter of the search.

The fitness of a repair sequence is measured by summing the magnitudes of all

the constraint violations that occur in a roster after the sequence has been applied.

For example, the magnitude of a cover violation is equal to the shortfall (the number

of extra nurses required to satisfy the constraint). The magnitude of MaxHours and

MinHours violations is the number of hours over or under the maximum or minimum

specified in the constraints, respectively. The constraint violations are not weighted in

any way. The specification of relative importance of constraints is stored in the case-

base. The nature of the repairs used for the different types of violations, and indeed

the different repairs used for different violations of the same type, implicitly represent

relative constraint importance. The reciprocal of the summed violation magnitudes is

used by the selection operator, thus turning the problem into a maximisation problem.
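The fitness calculation described above can be sketched in a few lines. The function name `fitness` is hypothetical, and returning infinity for a roster with no violations is an assumption made for the sketch, since the text does not say how a zero sum is handled.

```python
def fitness(violation_magnitudes):
    """Reciprocal of the unweighted sum of violation magnitudes."""
    total = sum(violation_magnitudes)
    if total == 0:
        # Assumption: a violation-free roster gets the best possible fitness.
        return float("inf")
    return 1.0 / total


# Toy roster (invented): a cover shortfall of 2 nurses, a MaxHours
# violation of 3 hours over, and a MinHours violation of 5 hours under.
example = fitness([2, 3, 5])
```

Because no weights are applied, a one-nurse cover shortfall and a one-hour MaxHours excess contribute equally here; any preference between them comes from the repairs stored in the case-base.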

At each generation the population is manipulated firstly by the genetic operators

of selection, crossover, and mutation, followed by the memetic operation which adds

more repairs to each of the sequences by repairing more violations from the roster

using CABAROST. The genetic operators work towards finding good quality permu-

tations of the existing repair sequences, whilst the memetic stage increases the size

of the sequences thus allowing them to repair more violations within the roster.

One-point crossover [31] with a repetition filter was used to combine two parents with a 0.8 probability. This operation chooses a position in each of two chromosomes at random and then creates two children by swapping the parents’ values to the right of the selected positions and removing all repetitions of repairs. Given two parents $x = x_0, x_1, \ldots, x_{X-1}$ and $y = y_0, y_1, \ldots, y_{Y-1}$, and crossover points $c_X \in [0, X-1]$ and $c_Y \in [0, Y-1]$, the two children produced by a crossover of $x$ and $y$ are:

\[ x' = x_0, \ldots, x_{c_X}, y_{c_Y+1}, \ldots, y_{Y-1} \quad \text{s.t. } y_i \neq x_j,\ 0 \le j \le c_X \tag{8.2} \]

\[ y' = y_0, \ldots, y_{c_Y}, x_{c_X+1}, \ldots, x_{X-1} \quad \text{s.t. } x_i \neq y_j,\ 0 \le j \le c_Y \tag{8.3} \]

The removal of repetitions from the repair sequences is an important part of

the crossover operation. Repeated copies of Reassign repairs will not have any effect

on the roster and are redundant. When swap and switch repairs are repeated they

effectively ‘undo’ themselves (for example, a swap repair applied twice will result in

unchanged assignments for the nurses involved). Consequently, if they remain in the

repair sequence after crossover they will reverse the repairs of previous violations,

thus increasing the total number of violations in the roster and therefore reducing the

fitness of the chromosome.
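A sketch of the one-point crossover with repetition filter (Equations 8.2 and 8.3) follows. It assumes that repairs can be compared for equality and that sequences are Python lists; the function names are hypothetical. As in the equations, only tail repairs that already occur in the retained head are filtered out.

```python
import random


def crossover_at(x, y, cx, cy):
    """Children of parents x and y for fixed crossover points cx and cy."""
    head_x, head_y = x[:cx + 1], y[:cy + 1]
    # Keep each head, append the other parent's tail, and drop any tail
    # repair that already appears in the head (the repetition filter).
    child_x = head_x + [r for r in y[cy + 1:] if r not in head_x]
    child_y = head_y + [r for r in x[cx + 1:] if r not in head_y]
    return child_x, child_y


def crossover(x, y, p_crossover=0.8, rng=random):
    """Apply crossover with probability p_crossover, else copy the parents."""
    if rng.random() >= p_crossover:
        return list(x), list(y)
    return crossover_at(x, y, rng.randrange(len(x)), rng.randrange(len(y)))


# Toy parents (invented); letters stand in for repairs.
child_x, child_y = crossover_at(["a", "b", "c", "d"],
                                ["c", "e", "a", "f"], cx=1, cy=0)
```

In the toy call, repair `a` is dropped from the first child's tail and repair `c` from the second child's tail, so neither child applies the same repair twice.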

Parents were selected for the crossover operator using a roulette wheel selection

process [165]. In this method, individuals are selected with a probability equal to

the proportion that their calculated fitness contributes to the sum of the fitness of

all members of the population. An elitist strategy is used, guaranteeing that the

fittest member of the old population is copied into the new population. This strategy

ensures that at least one member of the new population is of the highest quality found

thus far.
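Roulette wheel selection and the elitist copy can be sketched as below. The function names are hypothetical, and the sketch assumes all fitness values are finite and non-negative with a positive sum.

```python
import random


def roulette_select(population, fitnesses, rng=random):
    """Select one individual with probability fitness / sum(fitnesses)."""
    pick = rng.random() * sum(fitnesses)  # point on the wheel in [0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point rounding


def elite(population, fitnesses):
    """Fittest member, to be copied unchanged into the new population."""
    best_index = max(range(len(population)), key=fitnesses.__getitem__)
    return population[best_index]


# Toy population (invented): "b" has three times the fitness of "a".
rng = random.Random(0)
draws = [roulette_select(["a", "b"], [1.0, 3.0], rng) for _ in range(10000)]
share_b = draws.count("b") / len(draws)
```

Over many draws the share of `b` should be close to 0.75, its proportion of the total fitness.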

There are two mutation operators that are applied to members of the new popu-

lation. The delete mutation chooses a random number of repairs to delete from each

sequence. The number of deleted repairs is restricted to the interval [1,5]. The delete

mutation helps to limit the size of the sequences over time by removing repairs which

do not contribute to the overall chromosome quality. The swap mutation changes

the order of the repairs in a sequence by choosing a swap point and then swapping

all of the repairs that were before this point to the end of the sequence. Each of the

mutation operators is applied to sequences with a 0.1 probability.
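The two mutation operators might be sketched as follows. The text fixes the number of deleted repairs to the interval [1,5] but does not say which repairs are removed, so deleting at random positions is an assumption, as are the function names. The swap mutation is implemented as a rotation about the chosen point, which moves the repairs before the point to the end of the sequence.

```python
import random


def delete_mutation(seq, rng=random, max_delete=5):
    """Delete between 1 and max_delete repairs at random positions."""
    k = min(rng.randint(1, max_delete), len(seq))
    doomed = set(rng.sample(range(len(seq)), k))
    return [r for i, r in enumerate(seq) if i not in doomed]


def swap_mutation(seq, rng=random):
    """Move all repairs before a random swap point to the end."""
    if len(seq) < 2:
        return list(seq)
    point = rng.randrange(1, len(seq))
    return seq[point:] + seq[:point]


def mutate(seq, rng=random, p_mutation=0.1):
    """Apply each operator independently with probability p_mutation."""
    if rng.random() < p_mutation:
        seq = delete_mutation(seq, rng)
    if rng.random() < p_mutation:
        seq = swap_mutation(seq, rng)
    return list(seq)
```

Neither operator introduces new repairs: deletion shortens a sequence and the swap only reorders it, so new material enters only through the memetic extension step.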

After the genetic operators have been applied to the chromosomes in a population

each of them is extended by a random number of repairs using the CABAROST repair

generation method. The repair sequences are successively applied to the roster and

then a random selection of the remaining (and potentially newly generated) violations

are repaired. Again, after each sequence is extended the roster is reset back to its

original state.

8.3 Learning

A fundamental property of CBR systems is that of learning capability. During the use

of the memetic algorithm learning can be performed in two ways. When CABAROST

is not confident about the quality of repairs it produced it can prompt the user to

confirm or suggest an alternative. The distance between the generated repairs and the

repairs in the retrieved cases is considered. A confidence threshold is established for

repairs generated by CABAROST. If a repair has distance greater than this threshold

then the user is queried. This mechanism has the effect of increasing the coverage

of the cases in the case-base over time. The threshold can be set at different levels

depending on the amount of interaction desired. If the user would like to interact

frequently with the search process then it should be set at a low distance. If it is set at

a high value then the user will be prompted less often.
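The confidence check can be sketched as below. The names `accept_repair` and `ask_user`, and the representation of the case-base as a list of (violation, repair) pairs, are hypothetical; the sketch only illustrates that a repair whose distance exceeds the threshold is referred to the user and the confirmed or corrected repair is stored as a new case.

```python
def accept_repair(violation, repair, repair_distance, threshold,
                  ask_user, case_base):
    """Return the repair to apply, querying the user when confidence is low."""
    if repair_distance > threshold:
        # Low confidence: the user confirms the repair or supplies another.
        repair = ask_user(violation, repair)
        case_base.append((violation, repair))  # coverage grows over time
    return repair


# Toy interaction (invented): the user always replaces the suggestion.
case_base = []
kept = accept_repair("v1", "swap", 1.0, 1.5,
                     lambda v, r: "reassign", case_base)
changed = accept_repair("v2", "swap", 2.0, 1.5,
                        lambda v, r: "reassign", case_base)
```

A lower threshold sends more repairs through `ask_user`, which is the sense in which a low setting means more frequent interaction.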

Learning is driven not only by successfully repaired violations but also by repairs

that are considered to be failures. In this problem a failure is defined as the reappear-

ance of a violation after it has already been repaired using a case from the case-base.

This is assessed each time a repair is added to each repair sequence in the population.

A record is kept of the violation that was solved by each repair in the repair sequence.

When a violation is selected for repair the record is checked and if it has previously

been solved then the case which was used to generate the repair is deemed to have

failed.

The failure of a case is registered in the case-base by increasing the overall case

weighting (Wγ) used in formula 5.9. This increase in weight acts as a penalty on the

case. The effect of an increase in the weighting of a case is to increase its distance

from any future violations - thus reducing the chance that it will be used to solve

future violations.

It is likely that the majority of cases in the case-base will fail at some point due

to the highly constrained nature of rostering problems. Consequently, it is important

that the increase in weight upon failure is small so that cases with a low rate of

failure are still used to repair violations. The weighting is a gradual process which

will penalise most the cases which consistently fail.
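A sketch of the failure penalty follows. Formula 5.9 is not reproduced in this section, so the sketch assumes that the case weight multiplies the raw distance; the starting weight of 1.0, the increment of 0.05, and all names are hypothetical values chosen only to show a small, gradual penalty.

```python
def register_failure(selected_violation, solved_by, case_weights,
                     increment=0.05):
    """Penalise the case whose repair previously 'solved' this violation.

    solved_by maps each repaired violation to the case that generated its
    repair. Returns True if a failure was detected.
    """
    case_id = solved_by.get(selected_violation)
    if case_id is None:
        return False  # first appearance of this violation: not a failure
    case_weights[case_id] = case_weights.get(case_id, 1.0) + increment
    return True


def weighted_distance(raw_distance, case_id, case_weights):
    """Distance used at retrieval; a larger weight pushes the case away."""
    return case_weights.get(case_id, 1.0) * raw_distance


# Toy record (invented): violation v1 was earlier repaired using caseA.
weights = {}
solved = {"v1": "caseA"}
failed = register_failure("v1", solved, weights)
not_failed = register_failure("v2", solved, weights)
```

With a small increment a case must fail repeatedly before its weighted distance grows enough to change which case is retrieved, which matches the gradual penalty described above.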

8.4 System Evaluation

The CABAROST method depends very strongly on the quality of the case-base. A

set of experiments was designed to: (a) show the performance of the case-base as a

repair generation tool within a meta-heuristic algorithm; (b) explore the effects of

ongoing training of the case-base throughout its use; and (c) investigate the use of

case weighting as a means by which to penalise failure.

A new case-base was trained for these experiments using the method described in

Section 5.7 using violations and repairs of one month of rostering data (MARCH2001)

from the QMC. The case-base had to be retrained from the versions used in previous

chapters due to the new repair adaptation algorithm - the information required could

not be derived from the existing cases. This case-base was trained with experience

of all of the constraint types listed in Section 4.6. In this problem instance 20 nurses

were rostered over a four week period according to the constraints listed in Table 8.1.

The case-base contained 189 cases, covering the mix of violations listed

in Table 8.2. The number of cases stored for each violation is a reflection of the

variety of different decisions that can be made. For example, HardRequest violations

(when a nurse’s preferred shift is not assigned) are always repaired by swapping the

affected nurse’s assigned shift with a nurse of any other type who happens to be

currently assigned the requested shift. The number of cases stored is low because

this repair behaviour takes place in all circumstances. Cover violations, however,

require different repairs to be used in different circumstances and so a large number

of contrasting examples are used in the case-base.

In the experiments described in this section the memetic algorithm is applied to

the problem instances for the five months after March 2001. We investigate a number

of variations of the algorithm. Each variation of the algorithm was run 20 times for

each problem instance. All of the roster quality results presented are averages (and

standard deviations) of the best solution found during each of the 20 runs. The roster

quality results represent the sum of the magnitudes of the violations in the roster - i.e.

they are the reciprocals of the fitness values calculated within the memetic algorithm.

Table 8.1: Constraints for the QMC problem

Constraint Type    Description
Cover              EARLY shifts require 4 Qualified Nurses
Cover              EARLY shifts require 1 Registered Nurse
Cover              EARLY shifts require 1 Eye-Trained Nurse
Cover              EARLY shifts require 1 Auxiliary Nurse
Cover              LATE shifts require 3 Qualified Nurses
Cover              LATE shifts require 1 Registered Nurse
Cover              LATE shifts require 1 Eye-Trained Nurse
Cover              LATE shifts require 1 Auxiliary Nurse
Cover              NIGHT shifts require 2 Qualified Nurses
Cover              NIGHT shifts require 1 Eye-Trained Nurse
Cover              NIGHT shifts require 1 Auxiliary Nurse
MaxDaysOn          maximum number of consecutive shifts is 6
MaxHours           maximum number of hours in a fortnight (14 days) is 75
MaxHours           maximum number of hours is as set in nurse contracts
MinDaysOn          minimum number of consecutive shifts is 2
MinHours           minimum number of hours is 5 less than the contract maximum
SingleNight        nurses can not work only one night shift
Succession         an EARLY shift must not follow a NIGHT shift
WeekendBalance     no more than 3 weekends out of every 4
WeekendsInARow     no more than 3 weekends in a row
WeekendSplit       either both days of a weekend or not at all

Table 8.2: Initial case-base contents

Violation Type     Number
Cover              54
HardRequest        12
MaxDaysOn          7
MaxHours           13
MinDaysOn          20
MinHours           46
SingleNights       3
SoftRequest        14
Succession         4
WeekendBalance     5
WeekendsInARow     6
WeekendSplit       5

Table 8.3: Roster quality - CABAROST vs. Random Repair Generation

Month        Initial   Random                    CABAROST
                       Quality          Time     Quality          Time
April2001    545.00    197.30 (21.67)   365.2    161.34 (7.01)    623.5
May2001      592.75    224.35 (38.07)   392.4    198.01 (10.49)   589.8
June2001     698.25    243.84 (31.41)   315.2    229.73 (23.29)   623.4
July2001     530.00    271.14 (41.69)   341.5    233.15 (17.36)   821.9
August2001   675.25    234.51 (36.91)   299.3    189.04 (24.29)   615.2

The first experiment compares the results achieved by using the memetic algorithm

with the CABAROST generated repairs, and with repairs that have been created at

random. The random repair generation uses the basic parameter information for the

focus violation and creates a repair of random type, with random nurses and shifts

involved. By generating the repairs randomly the memetic algorithm is effectively

performing a search ‘by itself’ - i.e. with no knowledge coming from the case-base.

In this experiment the case-base trained using the MARCH2001 data was used with

no additional cases added.

Table 8.3 shows the results for the first experiment. The first column of results

shows the starting quality of each month’s roster (i.e. the preference roster). In the next

two columns the results of the memetic algorithm with random repair generation

are given for each month along with the average time taken (in seconds) to reach

the best solution. The final columns show the qualities and processing times of the

solutions produced by the memetic algorithm with CABAROST generated repairs.

The results show that the knowledge contained in the case-base significantly improves

the performance of the memetic algorithm. The quality of the solutions produced is

better using CABAROST repairs for all five problem instances. The CABAROST

repair generation also leads to more consistent solutions. This is demonstrated by

the lower standard deviations of the results of the twenty runs.

In the second experiment the effect of on-going training on the quality of solutions is assessed. On-going training of the case-base is performed by allowing the user to interactively participate in the search process. Whenever a repair is generated for a violation the repair distance is measured and this can be interpreted as a measure of confidence in the quality of the repair. If this repair distance is larger than a certain threshold value then the user is asked to confirm the repair. The user can accept the generated repair or improve it if it is not satisfactory. The new repair is added to the case-base thus increasing the experience it contains.

Table 8.4: Roster quality: static (no training) vs. ongoing training

Month        Static Case-Base   Threshold = 1.5   Threshold = 2.5
April2001    161.34 (7.01)      145.63 (9.64)     155.15 (11.28)
May2001      198.01 (10.49)     161.18 (12.64)    182.15 (13.16)
June2001     229.73 (23.29)     206.73 (10.17)    210.38 (10.85)
July2001     233.15 (17.36)     187.86 (17.02)    223.97 (16.30)
August2001   189.04 (24.29)     181.15 (8.38)     186.54 (11.23)

The memetic algorithm is used to solve each of the problem instances in turn.

Any new cases added to the case-base during the run of the memetic algorithm are

kept and used in the case-base for the following month. The algorithm is used with

repair confidence thresholds of 1.5 and 2.5, and also with no on-going training (where

all repairs retrieved from the case-base are accepted and only MARCH2001 is used

for training).
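The confidence check used for on-going training can be sketched as follows. This is a minimal illustration, not the thesis implementation: the `CaseBase` structure, the `confirm_if_unsure` function, and the `expert` stand-in for the interactive user are all hypothetical names introduced here, and cases are reduced to plain strings.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class CaseBase:
    """Toy case-base: a list of (violation, repair) pairs (hypothetical structure)."""
    cases: List[Tuple[str, str]] = field(default_factory=list)


def confirm_if_unsure(
    violation: str,
    generated_repair: str,
    repair_distance: float,
    threshold: Optional[float],
    case_base: CaseBase,
    ask_user: Callable[[str, str], str],
) -> Tuple[str, bool]:
    """Return (repair to apply, whether the user was queried)."""
    # threshold=None models the static case-base: every repair is accepted.
    if threshold is None or repair_distance <= threshold:
        return generated_repair, False
    # Low confidence: the user accepts the repair or supplies a better one.
    confirmed = ask_user(violation, generated_repair)
    # The confirmed repair is stored so that later months can reuse it.
    case_base.cases.append((violation, confirmed))
    return confirmed, True


def expert(violation: str, repair: str) -> str:
    """Stand-in for the interactive user; always substitutes one fixed repair."""
    return "expert repair"


case_base = CaseBase()
# The same repair distance (2.0) is handled differently under each setting.
strict = confirm_if_unsure("v1", "generated repair", 2.0, 1.5, case_base, expert)
lenient = confirm_if_unsure("v1", "generated repair", 2.0, 2.5, case_base, expert)
static = confirm_if_unsure("v1", "generated repair", 2.0, None, case_base, expert)
```

A lower threshold queries the user more often and therefore adds more cases, which is consistent with the 1.5 setting receiving more feedback than the 2.5 setting.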

Table 8.4 shows the average results over the 20 runs of the algorithms. The first

column shows the average and standard deviations of the results achieved by the

memetic algorithm using CABAROST and the static case-base (trained using only

the MARCH2001 data). The second and third columns show the results using on-

going training with repair confidence thresholds set at 1.5 and 2.5, respectively. These

results are also presented graphically in Figure 8.1.

It is evident that performing constant training of the case-base gives a consid-

erable advantage. For most problems the performance of the algorithm improved

significantly when it used the 1.5 acceptance threshold. As would be expected, the

performance increase was greater for the 1.5 acceptance threshold than it was for 2.5

because of the increased feedback from the user. The experience that is added to the

case-base about one month’s problem also improves the performance of the memetic

algorithm for later problems.
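The size of the advantage can be read directly from Table 8.4. The short sketch below computes the relative change against the static case-base for each month, assuming (as the surrounding discussion implies) that lower values indicate better rosters. The figures are the means from the table; the function and variable names are illustrative only.

```python
# Mean roster quality from Table 8.4: (static, threshold 1.5, threshold 2.5).
TABLE_8_4 = {
    "April2001": (161.34, 145.63, 155.15),
    "May2001": (198.01, 161.18, 182.15),
    "June2001": (229.73, 206.73, 210.38),
    "July2001": (233.15, 187.86, 223.97),
    "August2001": (189.04, 181.15, 186.54),
}


def relative_improvement(static: float, trained: float) -> float:
    """Fraction by which on-going training lowers the static result."""
    return (static - trained) / static


improvements = {
    month: (relative_improvement(s, t15), relative_improvement(s, t25))
    for month, (s, t15, t25) in TABLE_8_4.items()
}

for month, (gain_15, gain_25) in improvements.items():
    print(f"{month}: threshold 1.5 -> {gain_15:.1%}, threshold 2.5 -> {gain_25:.1%}")
```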

Figure 8.1: Roster quality: static (no training) vs. ongoing training

The number of times that the user is asked for feedback about repairs is an important

issue to be considered. An algorithm which requires consistently high levels

of user interaction may become tedious to use. Figure 8.2 shows the sharp decrease

in the average number of times that the user is asked for input for each problem.

For the APRIL2001 problem an average of 105.2 queries were presented to the user

when the threshold was set at 1.5. This had reduced to just 12 for the AUGUST2001

problem, 5 months later. The figure also shows the difference in number of queries

when different thresholds are used. By allowing the user to choose the threshold they

can specify the amount of input they want to have depending on how much time they

want to spend supervising the algorithm.

The third experiment is used to show that the penalising of cases which perform

badly in the search by using case weights increases the overall performance of the

memetic algorithm. The amount which is added to the case weight when the case has

failed is set at 0.1, 0.5, and 1. It is expected that the value of 1 will not lead to good

quality rosters because its effect on the case-base is too severe. Increments of 0.1 and

0.5 have a more gradual effect on performance by penalising most heavily those cases

which consistently fail.
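The effect of the increment size can be illustrated with a small sketch. How the case weight enters retrieval is not restated in this section, so the code assumes, purely for illustration, that the weight is added to the retrieval distance. The `Case` class and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Case:
    name: str
    distance: float      # distance from the stored violation to the new one
    weight: float = 0.0  # accumulated failure penalty


def penalise(case: Case, increment: float) -> None:
    """Record one failure of the repair generated from this case."""
    case.weight += increment


def effective_distance(case: Case) -> float:
    # ASSUMPTION: the penalty is additive; the thesis may combine it differently.
    return case.distance + case.weight


def retrieve(cases: List[Case]) -> Case:
    """Pick the case with the smallest penalised distance."""
    return min(cases, key=effective_distance)


def winner_after(failures: int, increment: float) -> str:
    """Name of the retrieved case after case 'a' has failed a number of times."""
    a, b = Case("a", 0.30), Case("b", 0.55)
    for _ in range(failures):
        penalise(a, increment)
    return retrieve([a, b]).name
```

Under this assumption a single failure with an increment of 0.1 leaves the closest case in use, and only repeated failures displace it, whereas an increment of 1.0 discards it after one failure. This matches the expectation that the largest increment is too severe.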

Figure 8.2: On-going training: number of queries (1.5 and 2.5 thresholds)

Table 8.5 shows the results for the different case weighting levels. The first column

shows the results for the memetic algorithm with CABAROST generated repairs

and no case weighting. The second, third, and fourth columns show the results for

increments of 0.1, 0.5 and 1.0 respectively. The results are represented graphically in

Figure 8.3.

Table 8.5: Roster quality: static (no case weighting) vs. case weighting on failure

Month        No Weighting     Inc. = 0.1       Inc. = 0.5       Inc. = 1.0
April2001    161.34 (7.01)    154.95 (6.27)    159.74 (6.86)    179.01 (24.65)
May2001      198.01 (10.49)   185.73 (14.93)   195.63 (14.24)   216.48 (34.80)
June2001     229.73 (23.29)   213.15 (5.45)    199.86 (7.40)    234.86 (30.27)
July2001     233.15 (17.36)   227.99 (13.22)   234.14 (16.30)   251.17 (34.92)
August2001   189.04 (24.29)   176.75 (15.33)   188.53 (26.84)   216.35 (42.23)

Figure 8.3: Roster quality: case weighting on failure

The results show that the weighting of failed cases increases the overall perfor-

mance of the memetic algorithm. The increment level 0.1 performed best on average

for all but one of the months’ data. An increment level of 0.5 shows a small improvement

on the un-weighted algorithm, except for the JULY2001 problem instance. An

increment level of 1.0 is clearly too high and the algorithm performed considerably

worse when this value was used.
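These observations can be checked against the means in Table 8.5. The sketch below picks the best setting for each month, again assuming that lower values are better. The dictionary layout and key names are illustrative only.

```python
# Mean roster quality from Table 8.5, keyed by failure increment.
TABLE_8_5 = {
    "April2001": {"none": 161.34, "0.1": 154.95, "0.5": 159.74, "1.0": 179.01},
    "May2001": {"none": 198.01, "0.1": 185.73, "0.5": 195.63, "1.0": 216.48},
    "June2001": {"none": 229.73, "0.1": 213.15, "0.5": 199.86, "1.0": 234.86},
    "July2001": {"none": 233.15, "0.1": 227.99, "0.5": 234.14, "1.0": 251.17},
    "August2001": {"none": 189.04, "0.1": 176.75, "0.5": 188.53, "1.0": 216.35},
}

# Setting with the lowest mean for each month.
best = {month: min(row, key=row.get) for month, row in TABLE_8_5.items()}

# Months in which each increment is worse than using no weighting at all.
worse_than_none = {
    inc: [m for m, row in TABLE_8_5.items() if row[inc] > row["none"]]
    for inc in ("0.1", "0.5", "1.0")
}

print(best)
print(worse_than_none)
```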

Figure 8.4 shows a roster produced by the memetic algorithm for the month of

April. This result can be compared to the manual and CBR-Tabu results given in

Figures 7.7 and 7.8 in Chapter 7. Again, the shaded background indicates violations of

shift requests and the ‘AL’ symbol indicates Annual Leave which cannot be changed.

Figure 8.4: A roster produced using the memetic algorithm

The roster produced using the memetic algorithm shows some qualitative improvements

over the CBR-Tabu result given in Figure 7.8. These improvements can

be attributed to the additional constraints defined - constraints which could not be

handled very well by the CBR-Tabu algorithms. For example, in the memetic result

weekend shifts are allocated more evenly due to the definition of three weekend

constraints. There are a number of examples of violations of these constraints in

the CBR-Tabu roster. Fiona, Claire, Anita, Tess, Daryl, and Linda all have shift

assignments which violate both the “three in a row” and “three out of four” weekend

constraints in the CBR-Tabu solution. There are no violations of these constraints

in the memetic solution. However, the constraint which requires that either both

or neither of the days in a weekend are assigned seems harder to satisfy and in the

memetic solution it is still violated in a number of places. This is also the case for the

SingleNight constraint - which requires that night shifts are not assigned in isolation.

It is interesting to note that both of these constraint types are also frequently violated

in the manually produced solution - indicating that these constraints are considered

to be less important by the rostering expert. We can conclude that this rostering

‘behaviour’ has been captured successfully by the case-base.

The memetic roster violates fewer nurse requests than both the CBR-Tabu and

manual rosters - in fact it leaves the nurse preferences almost entirely intact. Although

the details are not given due to a lack of space, the memetic roster also

conforms more closely to the working hours constraints than the CBR-Tabu result.

The nurses’ total assignment time is close to their contract specified hours. In this

sense, the memetic result is more similar to the manually produced roster.

Overall, the results show that the knowledge that can be captured and used

through case-based repair generation and the combinatorial search capabilities of

genetic algorithms can be successfully combined in a novel memetic algorithm. This

algorithm is capable of generating good quality sequences of repairs based on the

knowledge and experience held in the case-base.

8.5 Conclusion

This chapter has described a novel memetic algorithm. Sequences of repairs are

produced using the CABAROST system. Obviously the order of the repairs in these

sequences affects the quality of the final roster. A memetic algorithm searches for

optimal sequences of repairs generated by the CBR system. This memetic algorithm

differs from others in the literature through its use of CABAROST in place of the

local search that would normally be applied to populations between generations. The

results of experiments on the data from the QMC indicate that this hybridisation

provides an excellent tool for solving nurse rostering problems.

The quality and coverage of the case-base is maintained through a combination

of on-going training and case weighting. By using a case acceptance threshold the

algorithm can determine those instances for which it cannot be confident that the

decision it has made is correct. The user can set this threshold depending on the

amount of input they wish to have in the process. Cases which generate repairs which

are not of a high quality are detected by the memetic algorithm by monitoring the

appearance and reappearance of constraint violations. When a violation reappears in

the roster the case that was used to solve it is penalised by increasing its weight. This

increase in weight means that the case will be less likely to be used to solve future

similar violations.

This chapter has shown that CBR can be used to improve the performance of

evolutionary approaches to constraint optimisation. The memetic framework provides

an ideal basis for both detection of case failure, and for interactive involvement of

the user. The results show that using CABAROST, failure weighting strategies, and

interactive learning, the quality of the rosters produced is improved.

Part IV

Discussion

Chapter 9

Conclusions

This chapter provides a summary and discussion of the contents of the thesis. Section

9.1 begins with an analysis of the contribution which has been made, in terms of the

research objectives enumerated in the introduction chapter. Some of the key successes

are highlighted and discussed in a wider context. Details of the dissemination of the

research in publications and conferences are given in Section 9.5. In Section 9.2 the

comparison of the CABAROST system with other rostering methods is discussed and

in Section 9.3 some guidelines for applying similar case-based reasoning methods to

other combinatorial optimisation problems are described. The chapter concludes in

Section 9.4 with a discussion on some future research issues which should be addressed.

9.1 Contribution

The contributions made in this thesis are divided into two parts. The issues pertaining

to the problem of personnel rostering are described first, followed by the contribution

made within the field of CBR.

9.1.1 Personnel Rostering

(a) Personnel rostering problems from the literature and from hospitals have been

analysed (see Chapter 2). This thesis has presented details of the key parame-

ters, constraints, and dimensions which are present in rostering problems. The

overwhelming complexity of these problems has been described, especially the

differing treatments of staff qualifications, working and shift patterns, and of

the many different constraints which make rostering problems difficult to solve.

The issues of self-scheduling and participatory rostering practices have been ex-

plored in light of the more recent awareness that involvement in the rostering

process can considerably increase staff morale.

(b) Many of the most influential and successful automated rostering methods for

solving rostering problems have been described (see Chapter 2). The progress

of rostering research over the years has been charted, from the initial simplified

models which employed computationally expensive mathematical programming

tools, to the most modern meta-heuristic, constraint programming, and artificial

intelligence based approaches, which are capable of solving problems of ‘real-

world’ complexity.

(c) A real-world rostering problem from a UK hospital has been modelled (see Chapter

4). The nurse rostering problem is modelled as a set of constraints which

should be satisfied (if possible). A complex set of personnel characteristics can

be defined which are simultaneously hierarchical and overlapping, and include

such details as qualifications, specialty training, experience, gender, and inter-

national status. Customisable constraint types are defined which can be applied

to all employees, or over a subset of employees with certain characteristics. Re-

quests made by members of staff for particular shifts are collected in preference

rosters which later become the starting point of the rostering process. The

problem model was designed to be as generally applicable as possible and could

be used to model a wide variety of the problems described in the literature.

(d) This thesis has investigated the nature of human expert decision making for

personnel rostering. In Chapter 4 a set of simple repair operations was identified.

These repairs represented the kind of actions which manual rostering experts use

to try and resolve constraint violations. In Chapter 5 the CABAROST system

is described, which stores cases consisting of individual examples of constraint

violations and the corresponding repair operation which was used by a rostering

expert to solve them. When a new constraint violation is given to CABAROST the

first process involves searching through the case-base to find the cases which

contain the most similar constraint violation. The repairs contained in the

most similar cases are then adapted by generating sets of candidate repairs

(consisting of nurses from the current roster) and ranking these according to

their similarity to the retrieved repairs. The CABAROST system provides a

very flexible tool for collecting rostering knowledge from rostering experts. It

allows experts to demonstrate their treatment of constraints on a literally ‘case-

by-case’ basis. For example, a nurse manager treats staff requests in different

ways according to factors including (but by no means limited to) the importance

of other constraints on the day of requests, the number and skill of the other

nurses available on the day, and the number of requests that the nurses have

satisfied (or indeed violated) over the rest of the planning period.

The retrieval and adaptation phases both use notions of similarity when gener-

ating repairs for constraint violations. Cases consist of sets of feature indices.

These are characteristics of the constraint violations and repairs which are rele-

vant to the rostering decisions which must be made. In Chapter 6 the different

feature types which describe constraint violations are described. The initial

sets of features were determined through consultation with the rostering ex-

perts at the QMC. To refine these large initial sets, a data mining method

was developed which simultaneously selected relevant features and determined

their relative importance. A genetic algorithm is used to search for sets of fea-

tures, and feature weights, which maximise the classification accuracy of the

CABAROST retrieval phase. This ‘off-line’ approach improved the quality of

repairs produced by CABAROST. It also gave an insight into the nature of deci-

sion making for personnel rostering problems by identifying those characteristics

of the roster which were relevant to the repairing of constraint violations.

(e) CABAROST has been applied to solve rostering problems in a number of ways.

In interactive mode CABAROST can be used by rostering experts to guide

the rostering process. Rostering experts can choose violations to repair from

a roster and CABAROST will present them with repairs which are similar to

those which have been used for similar problems in the past. The expert can

then choose from the most similar repairs and the chosen repair will be applied

to the roster. If none of the repairs are suitable then the expert can specify a

different repair based on their knowledge and experience. This new constraint

violation and corresponding repair can then be stored in the case-base thus

increasing the knowledge contained.

To produce entire rostering solutions CABAROST is combined with meta-

heuristic tools. These hybrid approaches attempt to find solutions containing

minimal numbers of constraint violations. In Chapter 7 a set of algorithms was

presented which iteratively addressed the constraint violations in a roster. The

algorithms were constructed from different mechanisms, including tabu lists,

objective functions, random repair generation, and CABAROST repair genera-

tion. The results presented demonstrated that CABAROST generated repairs

significantly increased the performance of the algorithms in which it was used.

(f) The CABAROST method described in Chapter 5 treated constraint violations in

rosters as individual problems. Constraint violations are addressed using single

repair operations. When a constraint violation is repaired one of these three

scenarios results:

(i) The constraint violation is unaffected (the repair has failed);

(ii) The constraint violation is removed (the repair has succeeded);

(iii) The magnitude of the constraint violation is reduced in which case we

consider the resulting violation of the constraint to be new.

The repair adaptation rules ensure that the first scenario doesn’t happen by

guaranteeing that the generated repairs will always have an effect on the con-

straint violation. By treating the repair of constraint violations in this way

CABAROST can model the solution to constraint violations by recording both

the repairs to violations and the subsequent repairs made to the new violations

which result. Using the extended adaptation algorithm the new violations cre-

ated as a result of a repair are also recorded in the case-base. This information

represents the acceptability of violating constraints in specific circumstances.

Repair ordering knowledge is not applied directly through the case-base mech-

anism, but is exploited successfully through the hybrid memetic algorithm de-

scribed in Chapter 8. This algorithm searches for sequences of CABAROST

generated repairs which lead to minimally violated rosters.
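The three repair scenarios listed in item (f) can be expressed as a small classification over the magnitude of a violation before and after a repair. This is an illustrative sketch only: the numeric notion of magnitude, the `Outcome` names, and the treatment of a violation that grows (counted here as a failure) are assumptions, not taken from the thesis.

```python
from enum import Enum


class Outcome(Enum):
    FAILED = "violation unaffected"
    REMOVED = "violation removed"
    REDUCED = "violation reduced; the remainder is treated as a new violation"


def classify_repair(before: float, after: float) -> Outcome:
    """Classify a repair by the violation magnitude before and after it."""
    if before <= 0:
        raise ValueError("there must be a violation to repair")
    if after <= 0:
        return Outcome.REMOVED   # scenario (ii)
    if after < before:
        return Outcome.REDUCED   # scenario (iii)
    # Scenario (i). ASSUMPTION: a violation that grows is also counted as failed.
    return Outcome.FAILED
```

For example, a coverage shortfall of three nurses that a repair reduces to one would be classified as `REDUCED`, and the remaining shortfall would be handled as a new violation.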

9.1.2 Case-Based Reasoning

(a) The use of CBR for real-world problems was explored, particularly in the problem

domains of scheduling and planning, and combinatorial and constraint optimi-

sation in general (see Chapter 3). The basic CBR framework was presented as

a methodology for solving complex problems with incomplete domain models.

Some of the successful implementations of CBR for scheduling and planning

problems were discussed.

(b) The suitability of CBR as a methodology for solving rostering problems was ex-

plored. Some of the defining characteristics of rostering problems lead naturally

to CBR solutions. In particular, the somewhat subjective nature of decision

making in rostering causes difficulty when it comes to systematically eliciting

knowledge from experts. CBR provides a means by which experts can provide

knowledge by example. This contrasts with existing approaches which represent

domain knowledge by defining sets of ‘IF-THEN’ rules which can be restric-

tive and are often inaccurate, or by confining knowledge to utility functions

which measure roster quality by summing violations of arbitrarily weighted

constraints.

(c) Problem solving knowledge stored in cases must be general enough that it can be

applied to a wide range of problems. In this thesis the problem of generalising

domain objects for storage in the case-base was addressed (see Chapter 5). Vi-

olations and repairs are stored in cases without any ‘roster-specific’ knowledge.

Such knowledge includes details about the actual nurses, days, and sometimes

shifts involved. The information stored in the case-base is more general, and

describes nurse characteristics (such as qualifications or specialty training) and

information about the types of shifts already assigned on and around the affected

days. By storing only generalised information it is possible to use cases

stored during the solving of one roster to repair constraint violations in many

future rosters, even those consisting of entirely different sets of nursing personnel.

(d) Chapter 5 describes the CABAROST system for personnel rostering. Algorithms

for retrieving and adapting cases from the case-base were developed. The re-

trieval algorithm consisted of two phases. In the first phase all of the cases in

the case-base containing constraint violations of the same type as the current

problem were identified. These cases were then ranked according to the distance

from their violation feature vectors to the feature vector of the new violation.

The adaptation phase generated sets of repairs of the same type as those in

the retrieved cases. These generated repairs were then ranked according to their

distance from the retrieved repairs.

The CABAROST method was designed to imitate the behaviour of human ex-

perts. The performance of CABAROST in this respect has been measured

in two ways. In Chapter 5 the decisions made by CABAROST were compared

with those of a rostering expert. It was shown that the accuracy of CABAROST

improved with the addition of more cases to the case-base. Furthermore, the

decisions made by CABAROST were similar to those made by the expert for

a high percentage of the constraint violations presented. In Chapter 6 the per-

formance of CABAROST was measured in terms of its classification accuracy.

This is a measure of the percentage of cases in the case-base which can be iden-

tically solved if they are removed. For most of the constraint violation types

CABAROST performed well, and this performance was enhanced further by the

genetic algorithm which selected and weighted the constraint violation features

according to their relevance.

(e) It has been shown in this thesis that the hybridisation of CBR systems with

meta-heuristic search methods is a promising research area. Chapter 7 described

an integration of CABAROST with tabu search concepts and in Chapter 8 a

memetic algorithm was developed. In the experiments performed on the data

from the QMC the use of CABAROST generated repairs greatly increased the

performance of the hybrid approaches.

(f) Guidelines for the initial training of the case-base were given in Chapter 5. If

followed, these guidelines ensure that the cases stored in the case-base will

be useful for future reasoning and ensure that there is sufficient knowledge in

the case-base to solve a wide range of problems. In Chapter 8 an interactive

memetic algorithm is presented which prompts the decision maker for additional

knowledge if the decisions made do not fall within a certain confidence limit

based on the distance between generated and retrieved repairs. It is shown that

this form of on-going training significantly increases the performance of the

‘stand-alone’ memetic algorithm. To ensure that the decision maker is not required

to answer ‘too many’ queries the confidence threshold can be set to represent

the amount of interactivity desired.

(g) In general, a CBR system should be able to learn from failure. The memetic

algorithm in Chapter 8 searches for sequences of CABAROST generated repairs

which lead to minimal violations in the final roster. In the context of repair

sequences failure is defined as the reappearance of a violation that has already

been repaired. When this happens the case from which the repair was generated

has to be penalised to reduce the possibility that it will be selected in future.

In order to accomplish this, a case weighting strategy is proposed that enables the system

to avoid repeating the same mistake. The experimental

results show that the application of this strategy, over time, increases the quality

of final rosters.

(h) This research has investigated technologies for maintaining case-base perfor-

mance. Chapter 6 describes a genetic algorithm for selecting and weighting the

features used to describe constraint violations. The application of this off-line

algorithm increases the performance of CABAROST in two ways. The accuracy

of the decision making is improved because only features relevant to the prob-

lem are used to generate repairs. The reduction in the number of features also

reduces the storage requirements for cases in the case-base. The method for

learning from case failure has a positive side-effect, namely that it can be used

to limit the size of the case-base to only those cases which are performing well.

Cases which are frequently penalised for failure can be identified during

off-line processing and removed from the case-base. A reduction in case-base

size will increase the speed with which cases are retrieved.
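The two-phase retrieval summarised in item (d) and the classification accuracy measure used to evaluate it can be sketched together. The sketch makes several assumptions that are not confirmed by this chapter: violation features are numeric tuples, the distance is a weighted Euclidean distance, and a weight of zero stands for a feature that has been deselected. The `Case` class, the function names, and the toy data are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Case:
    violation_type: str
    features: Tuple[float, ...]
    repair: str


def weighted_distance(a: Sequence[float], b: Sequence[float],
                      weights: Sequence[float]) -> float:
    # ASSUMPTION: weighted Euclidean distance over the violation features.
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def retrieve(case_base: List[Case], violation_type: str,
             features: Sequence[float], weights: Sequence[float]) -> List[Case]:
    # Phase 1: keep only cases with the same type of constraint violation.
    same_type = [c for c in case_base if c.violation_type == violation_type]
    # Phase 2: rank them by distance to the new violation's feature vector.
    return sorted(same_type,
                  key=lambda c: weighted_distance(c.features, features, weights))


def classification_accuracy(case_base: List[Case],
                            weights: Sequence[float]) -> float:
    """Fraction of cases whose repair is reproduced when the case is removed."""
    hits = 0
    for i, case in enumerate(case_base):
        rest = case_base[:i] + case_base[i + 1:]
        ranked = retrieve(rest, case.violation_type, case.features, weights)
        if ranked and ranked[0].repair == case.repair:
            hits += 1
    return hits / len(case_base)


# Toy data: only the first feature is relevant to the choice of repair.
toy_cases = [
    Case("coverage", (1.0, 0.0), "swap"),
    Case("coverage", (0.9, 1.0), "swap"),
    Case("coverage", (0.0, 0.0), "reassign"),
    Case("coverage", (0.1, 1.0), "reassign"),
    Case("request", (0.5, 0.5), "swap"),
]
accuracy_all_features = classification_accuracy(toy_cases, (1.0, 1.0))
accuracy_selected = classification_accuracy(toy_cases, (1.0, 0.0))
```

In this toy data the second feature is noise, so setting its weight to zero raises the accuracy. A search over weight vectors, such as the genetic algorithm of Chapter 6, would use a measure of this kind as its fitness.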

In the introduction chapter three key areas of deficiency were identified in existing

approaches to personnel rostering. The following sections discuss these areas in turn,

and describe how successful the methods described in this thesis are at addressing them.

9.1.3 Knowledge Acquisition

The problem of reliably eliciting domain knowledge from experts in complex, ill-defined

domains is well documented. This so-called knowledge acquisition bottle-neck

occurs because operational research practitioners often have difficulty translating the

management requirements of domain experts into the mathematical models needed to

apply most OR methods. For example, rule based approaches require the reasoning

of domain experts to be represented by sets of ‘IF-THEN’ rules. Acquiring knowledge

in this way is difficult and time-consuming, and frequently leads to inaccurate and

incomplete domain models.

In general, automated rostering methods represent domain knowledge in a limited

number of ways. The main ‘locations’ for such knowledge are:

- Objective functions : Arguably, most of the ‘reasoning’ knowledge about rostering

problems is represented by an objective (utility or fitness) function. For the

majority of problems, no matter what the method, this function consists of

a weighted sum of the constraint violations in the roster. The weights are

set, in consultation with domain experts, to represent the relative importance

of constraints. The effect of a higher weight assigned to a constraint type is a

more severe penalty added to the objective function if the constraint is violated.

The ‘bottle-neck’ with this form of representation is the setting of these weights.

Indeed, the most successful approaches applied to real problems have been those

which allow the user to interactively change the weights until the solution fits

their requirements.

- Constraints : Reasoning knowledge can be represented as complex feasibility con-

straints on the possible instantiations of decision variables. Constraint reason-

ing methods have had some success representing knowledge in this way because

they provide a fairly intuitive ‘language’ for representing constraint informa-

tion. However, it is not easy to represent all forms of reasoning in this way,

particularly when requirements are not feasibility defining but rather represent

desirable solution characteristics. In general, these requirements are still repre-

sented purely in the definition of the objective function.

- Neighbourhood definitions : Meta-heuristic search methods can represent a large

amount of knowledge in the definition of ‘neighbourhoods’ around current solu-

tions. These definitions tend to be very problem specific, and sometimes lack a

clearly intuitive link to the operational requirements which they are supposed

to model. Consequently, methods developed using such definitions tend to be

excellent at solving the problems they were designed for, but cannot easily be

used to solve new problems.
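The weighted-sum objective function described in the first item of this list can be written compactly. The notation is introduced here for illustration only and is not taken from the thesis: R is a roster, C is the set of constraint types, v_c(R) measures the violation of constraint type c in R, and w_c is the weight set in consultation with the domain experts.

```latex
f(R) = \sum_{c \in C} w_c \, v_c(R), \qquad w_c \ge 0
```

A search method minimises f(R), so raising w_c makes violations of constraint type c more costly relative to the others. This is why the choice of weights forms the bottle-neck described above.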

Case-based reasoning provides a new means by which to represent domain knowl-

edge for scheduling problems. The case-base consists of examples of actual instances

of expert reasoning. The episodic nature of cases provides a flexible framework for

eliciting domain knowledge from experts in a natural and intuitive fashion. In par-

ticular, the need to find broader patterns in the reasoning behaviour of experts is

removed by the direct application of stored experience to future problem solving.

The CABAROST system described in this thesis represents rostering domain knowledge

using case-based reasoning. Of course, some of the domain knowledge still resides

in the definitions of the decision variables and of the customisable constraint types.

But an attempt has been made to limit these to representations of static domain

knowledge. The knowledge about how the constraints and instantiations of decision

variables interact to produce satisfactory solutions is collected and reused through

the case-base. We have shown that it is possible to accurately imitate the decisions

of rostering experts by comparing the decisions produced by CABAROST with those

made by rostering experts. Furthermore, the high classification accuracy achieved by

CABAROST once trained demonstrates that this form of knowledge representation

is both valid and successful.

When CABAROST is used in conjunction with meta-heuristic approaches the

situation becomes more mixed. Some knowledge is represented within the meta-

heuristic’s objective function. However, in the approaches described in this thesis

this is kept to a minimum, and in general such objective functions are used to merely

guide the search process towards rosters which violate few constraints. Importantly,

the results of experiments on the tabu search and memetic algorithm hybrids show

that it is possible to solve problems successfully without relying solely on the definition

of the objective function.

9.1.4 Adapting the Domain Model

Most real-world problems are not static in nature. This is particularly true of per-

sonnel rostering problems in modern organisations, where management requirements

change rapidly to react to changes in the types of service that must be provided.

Automated rostering methods are generally poor at adapting to these changes, often

due to the highly instance-specific nature of the problem solving knowledge used.

Significant progress has been made towards designing systems which allow users to

change static problem knowledge, such as staff characteristics, planning periods, and

shift types. But there is still a need for methods which allow more flexibility for

representing reasoning knowledge.

CABAROST provides a more adaptable model for representing personnel roster-

ing problems. At the most basic level CABAROST can be adapted to changes in

the problem domain through re-training - either starting from an empty case-base,

or by defining the new constraints and training the case-base specifically for these.

The modular nature of the problem representation allows decision makers to remove

cases which are no longer valid. These could be individual cases representing one-off

decisions. Alternatively, they could be groups of cases pertaining to a particular con-

straint type which is no longer used. This entire process is facilitated by the intuitive

knowledge acquisition tools which case-based reasoning methods naturally provide.

The memetic algorithm described in Chapter 8 provides further scope for problem

adaptation, in more of an ‘on-line’ sense. The thresholds on the acceptable distance

of generated repairs from those retrieved from the case-base can, over time, act as

an automatic method for detecting a shift in the nature of the problem domain. In

circumstances where the threshold is exceeded CABAROST interactively requests new

domain knowledge in the form of cases. As a result, any gaps in the knowledge stored

in the case-base which have appeared due to changes in the problem domain are filled.

The ability of the case-base to learn from failure also helps CABAROST to adapt.

Cases which start to fail regularly may do so because they are no longer valid for new

problems. If this happens often enough then these cases will naturally fall out of

use, and can be removed from the case-base once their penalty weight increases to a

pre-defined level.
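Removing cases whose penalty weight has reached the pre-defined level amounts to a simple filter over the case-base. The sketch below is illustrative only: the dictionary representation of a case, the `weight` field, and the cut-off value used in the example are assumptions introduced here.

```python
from typing import Dict, List, Tuple

CaseRecord = Dict[str, object]  # hypothetical: {"name": str, "weight": float}


def prune_case_base(cases: List[CaseRecord],
                    max_weight: float) -> Tuple[List[CaseRecord], List[CaseRecord]]:
    """Split the case-base into cases to keep and cases to remove."""
    kept = [c for c in cases if c["weight"] < max_weight]
    removed = [c for c in cases if c["weight"] >= max_weight]
    return kept, removed


example_cases = [
    {"name": "case-1", "weight": 0.0},
    {"name": "case-2", "weight": 0.4},
    {"name": "case-3", "weight": 2.1},
]
# The cut-off of 2.0 is an arbitrary value chosen for the example.
kept_cases, removed_cases = prune_case_base(example_cases, max_weight=2.0)
```

Returning the removed cases rather than discarding them silently would let a decision maker review them before they are deleted.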

9.1.5 Reacting to Unexpected Events

In organisations such as hospitals the absence of staff members due to unexpected

illness or unplanned leave can have a severely detrimental and even dangerous effect

on the quality of service provided. Sudden increases in demand can also require

that more staff are assigned to shifts. Quick manual changes to working rosters are

often made whereby other staff members are called in to work on a ‘last-minute’

basis, or temporary staff are borrowed from floating pools. This may be acceptable

if it happens very rarely, but regular adjustments to rosters can cause violations to

longer term temporal constraints which will often remain unresolved. Such violations

have a negative effect on the morale of the remaining staff members.

CABAROST provides a decision support tool which can help nursing managers re-

solve such staffing crises. The removal of a staff member from the roster will inevitably

produce violations of coverage constraints, and CABAROST will be able to suggest

repairs for these based on the expert’s previous rostering decisions. Any further vio-

lations, generated as a result of the repair of the coverage violation, can be dealt with

subsequently either interactively or through the use of a hybrid CABAROST/meta-

heuristic approach.

9.2 Method Evaluation

The CABAROST system is significantly different to the approaches discussed in the

literature. Many of the qualitative operational advantages of the system have been

discussed in the previous sections. In this thesis the performance of the system

on a set of real-world rostering problems from a UK hospital has been measured.

CABAROST proved to be capable of imitating the rostering decisions made by nurs-

ing managers reliably. Additionally, hybrid meta-heuristic methods incorporating

CABAROST generated repairs out-performed those which make use of random re-

pair generation strategies for all of the QMC data. It is clear that the CBR-based

methods described in this thesis are capable of solving a complex real-world rostering

problem.

At the time of writing this thesis the CABAROST method is being implemented

at the Beaumont Hospital in Dublin, Republic of Ireland. This hospital is in the

process of implementing an institution-wide, web-based, rostering database for the

wide variety of different rostering problems in different wards. The CABAROST

method will be incorporated with this system, initially as an interactive decision

support system for the nurse managers in each ward. Later, the hybrid meta-heuristic

methods will be implemented to allow central management to automatically generate

rosters using the case-bases trained by the ward managers.

A great deal of work has been carried out to make sure that the problem model is as

general as possible through the analysis of the problem descriptions of many research

papers in the literature. Collaborative research was carried out with researchers at

Kaho St. Lieven, in Gent, Belgium, to determine the portability of problems between

the CABAROST based system, and the successful commercial nurse rostering system

ANROM [52, 53, 54, 55, 235], which is in use in many Belgian hospitals. In general, it

was possible to model problems using both systems. However, the following problems

were discovered with regards to comparing the performance of the systems away from

the domain experts who defined the problems:

- Domain Knowledge : Reasoning knowledge is represented largely in the case-base

in CABAROST, but largely in the objective function in ANROM. Transferring

knowledge between these two representations was highly problematic - it was

essentially equivalent to the knowledge elicitation bottle-neck described earlier

for problem modelling in general. Consequently, the models created by the

use of each system for the other’s problems were heavily influenced by the

interpretation and bias of the researchers.

- Assessment of Roster Quality : No standardised method exists for comparing the

quality of rosters produced by different methods. In ANROM the objective

function represented the complex interactions and relative importance of the

constraints. Many of these problem characteristics are represented in the case-

base in CABAROST, reducing the objective function used by the meta-heuristic

hybrids to a guiding role.

Comparing meta-heuristic methods for rostering and other scheduling problems

has traditionally been a simple matter of applying them to problems using the same

objective function to measure quality. However, this approach is not possible when the

objective functions used by methods are necessarily different for the same problems.

In fact, to argue that these are ‘the same problems’ is certainly erroneous in a strictly

mathematical sense. Clearly, real-world problems can be represented in many ways

depending on the solution approach adopted.

Consequently, two research tasks traditionally carried out by operational researchers

cannot be performed here. The application of CABAROST to problems from the lit-

erature was not possible because domain experts were not available to provide the

relevant case-by-case examples of their rostering behaviour. Likewise, other methods

from the literature could not be directly applied to the problem at the QMC, which

was represented by the knowledge contained in the case-base. The process of ap-

plying these methods to ‘translated’ versions of the problems would re-introduce the

modelling deficiencies which CABAROST was engineered to avoid, and the solutions

produced by the methods could not be compared to those produced by CABAROST

due to incompatibilities in the objective functions.

This problem of accurate comparison of dissimilar methods needs to be urgently

addressed by the academic scheduling community. A detailed statistical analysis of

solutions to determine their quality, perhaps in a multi-objective sense, could be one

way forward. As a first step, it is essential to recognise that blindly applying identical

objective functions to different methods is not a valid basis for comparison.

9.3 Applicability to Other Domains

The case-based reasoning principles used in the CABAROST method could be applied

to many scheduling and other combinatorial optimisation problems. To facilitate the

development of such methods it is useful to identify the key objects and processes

which must be defined. These are:

- Constraints/Violations : The definition of constraints is a concept common to most

methods for solving combinatorial optimisation problems. To apply case-based

reasoning it would be necessary to represent the structural characteristics of

both constraints and their violations. This would include information such as

the type of constraint, the type of domain objects involved (e.g. jobs, resources,

routes), and the requirements dictated by the constraint in terms of those ob-

jects.

- Repair Operations : The operations used to repair constraint violations must be de-

fined. These may be elicited from domain experts by determining how manual

changes to solutions are made. Alternatively, they may be determined through

analysis of the mathematical characteristics of the problem. Again, the struc-

tural features of these repairs must be determined in terms of the domain ob-

jects.

- Generalisation: Key to the success of a case-based reasoning system would be the

translation of domain object characteristics into objects suitable for storing in

the case-base. To do this, objects must be abstracted out of their particular

problem instance by, for example, removing resource identification labels. One

principle must be adhered to when designing object generalisations, namely

that no representation of a case should be applicable to only one instance of a

problem.

- Feature Indices : After the structural characteristics of domain objects have been

identified the feature index set must be established for each type of constraint

violation. Initially, a large set of features could be identified through analysis

of the problem and interviews with domain experts. In order to reduce the size

of this large set to a computationally feasible number of features, and to ensure

that the features are relevant, the feature selection and weighting algorithm

from Chapter 6 could be used. This algorithm would not need to be adapted

for new problems - it could be used exactly as it is described in this thesis.

Alternatively, other methods for improving classification accuracy of case-bases

by reducing the size of the feature space could be employed.

- Repair Adaptation Rules : The cases retrieved from the case-base must be adapted

to the context of the problem instance being solved. In particular, the gen-

eralised objects used to describe violations and repairs in the case-base must

be combined with the information from the violation being solved in a sensible

way. Adaptation rules must be designed so that they produce repairs that will

have a measurable effect on the constraint violation in the problem instance,

but they must not be too restrictive in their interpretation of the cases.
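
The generalisation principle in the list above can be illustrated with a short Python sketch. The dictionary representation and the field names (`nurse_id`, `ward_id`, `nurse_type`, `magnitude`) are hypothetical and chosen only for this example; they are not the representation used by CABAROST.

```python
def generalise(violation: dict, id_fields=("nurse_id", "ward_id")) -> dict:
    """Strip instance-specific identification labels from a violation so the
    stored description is not tied to one problem instance (assumed fields)."""
    return {key: value for key, value in violation.items() if key not in id_fields}
```

Applied to `{"nurse_id": "N17", "nurse_type": "RN", "magnitude": 2}`, the function keeps the structural information (nurse type and magnitude) and discards the label that identifies one particular nurse.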

A case-based method which addresses these five issues could use the retrieval and

adaptation algorithms described in this thesis with very few alterations. The devel-

opment of such methods should be the focus of future research into the application

of case-based reasoning to combinatorial optimisation problems.
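
As a rough guide for such developments, the sketch below shows nearest-case retrieval with a weighted Euclidean distance over feature indices, the kind of similarity measure used in this thesis. It assumes that every feature has already been reduced to a single number and that each weight multiplies the squared difference; both are simplifications made for illustration, and the function names are hypothetical.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two numeric feature vectors
    (assumed form: square root of the weighted sum of squared differences)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def retrieve_nearest(query, case_features, weights):
    """Return the index of the stored case closest to the query violation."""
    return min(
        range(len(case_features)),
        key=lambda i: weighted_distance(query, case_features[i], weights),
    )
```

Setting a weight to zero removes the corresponding feature from the comparison, which is how feature selection and feature weighting can be handled by the same vector of weights.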

9.4 Future Work

This thesis has described an exploratory attempt at applying case-based reasoning

methodology to the problem of producing personnel rosters which has produced some

successful outcomes. However, a large number of the different technologies which can

be used to build CBR systems would be very applicable to rostering problems and

should be explored. In this section some suggestions for promising research directions

are described.

A more systematic approach to case-base maintenance should be developed to

ensure that the quality, consistency, and coverage of the case-base are optimal. It

may be necessary to develop a quantitative measure of the success of a repair for a

constraint violation in order to provide a direction for the improvement of solution

quality. More complex representations of case failure could be used to purge the

case-base of erroneous cases, perhaps by interactively allowing the decision maker to

define failure from their point of view.

More constraint types could be added to CABAROST increasing its applicability

to a wider range of problems. In particular, constraints which take into account his-

torical data should be implemented. The mathematical representation of constraints

could be extended to better represent the subjective nature of personnel rostering

through the use of fuzzy reasoning. One obvious area for the application of fuzzy rea-

soning is for coverage constraints. Fuzzy sets could be used to represent the expert’s

minimum, desired, and maximum preference for the number of staff needed at any

time. Such definitions could be extended to allow for fuzzy representation of violation

similarity, and of the distance between generated and retrieved repairs.
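
One possible reading of the fuzzy coverage idea is sketched below in Python. The triangular shape, which peaks at the desired staffing level and falls to zero at the minimum and maximum, is an assumption of this example; other membership functions could serve equally well, and the function name is hypothetical.

```python
def coverage_membership(staff: float, minimum: float, desired: float,
                        maximum: float) -> float:
    """Degree (0 to 1) to which a staffing level satisfies a coverage
    requirement, using a triangular fuzzy set (assumed shape).

    Requires minimum < desired < maximum.
    """
    if staff <= minimum or staff >= maximum:
        return 0.0
    if staff <= desired:
        return (staff - minimum) / (desired - minimum)
    return (maximum - staff) / (maximum - desired)
```

With a minimum of 2, a desired level of 4, and a maximum of 8 nurses, a shift staffed by 3 nurses would have a membership of 0.5, so under-coverage becomes a matter of degree rather than a binary violation.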

One of the most promising directions for future research is the hybridisation of

CABAROST, and other case-based reasoning approaches in general, with traditional

meta-heuristic and constraint satisfaction methods. CBR provides an excellent tool

for increasing the level of abstraction of methods for solving combinatorial optimisa-

tion problems. In particular, using CBR to help meta-heuristic methods avoid local

optima in the solution space could greatly improve their speed and reliability.

9.5 Dissemination

The research described in this thesis has been disseminated through conferences and

publications in both the artificial intelligence and operational research fields. Follow-

ing is a list of the publications and conference abstracts which have been produced.

9.5.1 Journal Papers

[37] G R Beddoe and S Petrovic. Selecting and weighting features using a genetic

algorithm in a case-based reasoning approach to personnel rostering. Accepted

for publication in the European Journal of Operational Research, November

Cases in the case-based approach to nurse rostering consist of descriptions of

constraint violations and the repairs that were used to solve them. Constraint

violations are represented by sets of feature indices, which describe their char-

acteristics which are relevant to the type of problem solving which they will

be used for. Similarity between violations (and therefore cases) is measured

using a Euclidean distance function which is weighted according to the relative

importance of each feature index. In this paper, the problem of automatically

selecting and weighting feature indices is investigated. A genetic algorithm is

developed which simultaneously searches for good quality feature selections and

weightings according to the classification accuracy of the case-base. This algo-

rithm significantly increases the performance of the case-base and reduces the

memory requirements per case by removing irrelevant and erroneous feature

indices. The results also provide an insight into the nature of manual nurse ros-

tering, by highlighting the roster characteristics which are used when making

rostering decisions.

[35] G R Beddoe and S Petrovic. Case-based reasoning for combinatorial optimisa-

tion problems. Submitted to Artificial Intelligence, November 2004;

In this paper the case-based approach is discussed in terms of its applicabil-

ity to the wider field of combinatorial optimisation problems, in particular to

those which can be modelled as constraint optimisation problems. The paper

focuses on four key issues: (a) the generalisation and abstraction of domain

knowledge to ensure cases are applicable to a wide variety of potential future

problems; (b) choosing the order in which to repair constraint violations;

(c) learning from the failure of reasoning episodes; and (d) interactive training of the

case-base during meta-heuristic search. A memetic algorithm is described which

is a hybridisation of a genetic algorithm and the case-based repair generation

system. This evolutionary approach generates populations of repair sequences

from the case-base and then searches for the optimal ordering of the repairs

using crossover, mutation, and selection operators. At each generation a small

number of the remaining constraint violations in the roster are repaired, thus in-

creasing the length of the repair sequences. Learning from failure is addressed

by monitoring the repair sequences for constraint violations which reappear.

When a constraint violation reappears in a roster the case which was originally

used to solve it is deemed to have failed. This failure is represented in

the case-base by a case-weighting strategy, which penalises poorly performing

cases. The case-base can be continuously trained throughout the use of the

memetic algorithm by employing acceptance thresholds which detect when the

case-based repair generation cannot confidently produce repairs. The user can

be prompted to verify or correct the repairs generated throughout the search

process. The memetic algorithm is tested on a series of monthly data from the

QMC. The results show that the use of case-failure weighting and interactive

training strategies increases the quality of the produced rosters over time.

[36] G R Beddoe and S Petrovic. Combining case-based reasoning with tabu search

for personnel rostering problems. Submitted to Annals of Operations Research

special issue on Personnel Scheduling and Planning, January 2004;

This paper investigates the use of case-based reasoning to improve the perfor-

mance of meta-heuristic algorithms. A basic framework for iteratively solving

constraint optimisation problems is presented. A series of algorithms are de-

scribed which use combinations of the following operators: case-based or random

repair generation, tabu lists, and objective functions. The results of experiments

on the data from the QMC show that it is possible to model significant domain

requirements by representing them as episodic knowledge in the case-base. This

reduces the need to describe these requirements through definitions of objec-

tive functions alone. The algorithms which used case-based repair generation

performed significantly better than those which generated repairs to constraint

violations randomly. The use of tabu lists and objective functions complemented

the case-based repair generation methods, resulting in an algorithm capable of

producing good quality nurse rosters with very few remaining constraint viola-

tions.

[33] G R Beddoe, P De Causmaecker, S Petrovic, G Vanden Berghe. Comparison

of methods for nurse rostering. In preparation for submission to Journal of

Scheduling;

[189] S Petrovic, G R Beddoe, and G Vanden Berghe. Case-based reasoning in employee

rostering: learning repair strategies from domain experts. In preparation

for submission to Applied Artificial Intelligence;

9.5.2 Conference Papers

[34] G R Beddoe and S Petrovic. A novel approach to finding feasible solutions to personnel

rostering problems. In Proceedings of the 14th Annual Conference of the

Production and Operations Management Society (POM), Savannah, Georgia,

United States, April 2003;

The problem of finding rosters which do not violate any constraints

is investigated in this paper. The case-based methods for repairing constraint

violations in rosters are described. These methods can be applied to rosters

iteratively in an attempt to solve every violation present. A number of iterative

algorithms are presented which combine the case-based repair generation with

the meta-heuristic concept of a tabu list. The results of experiments from the

QMC show that algorithms which use case-based repair generation perform

significantly better than those which generate such repairs randomly.

[188] S Petrovic, G R Beddoe, and G Vanden Berghe. Storing and adapting repair experiences

in employee rostering, Practice and Theory of Automated Timetabling

IV - Selected Papers from PATAT 2002, Lecture Notes in Computer Science Se-

ries LNCS2740, Springer-Verlag, 2003, ISBN 3-540-40699-9, pp 149-166;

This paper describes a mathematical framework for repairing constraint viola-

tions in personnel rosters using case-based reasoning. Retrieval and adaptation

algorithms are described in detail. The paper describes the notion of violation

and repair similarity using a weighted similarity measure. Experiments on the

QMC data showed that the case-base performance improves as the number of

cases it contains increases. Some initial results of experiments with variable

weighting are described, which give some insight into the types of roster char-

acteristics which are used when making rostering decisions.

9.5.3 Abstracts

- CORS/INFORMS Joint International Meeting 2004, Banff, Alberta, Canada, 16th - 19th

May, 2004. G R Beddoe, T E Curtois, P De Causmaecker, S Petrovic, G

Vanden Berghe. A Model for the Nurse Rostering Problem

- EURO/INFORMS Joint International Meeting 2003, Istanbul, Turkey, 6th - 10th

July, 2003. G R Beddoe and S Petrovic. Determining Feature Weights in a

Case-Based Reasoning Approach to Personnel Rostering

- 4th International Conference on the Practice and Theory of Automated Timetabling,

KaHo St.-Lieven, Gent, Belgium, 21st-23rd August, 2002. S Petrovic and G R

Beddoe and G Vanden Berghe. Storing and adapting repair experiences in per-

sonnel rostering

- 16th Triennial Conference of the International Federation of Operational Research

Societies, University of Edinburgh, Edinburgh, UK, July 7th-12th, 2002. S

Petrovic and G R Beddoe and G Vanden Berghe. Case-based reasoning in

employee rostering: learning repair strategies from domain experts

- EPSRC Inter-disciplinary Scheduling Network PhD Student Workshop on Schedul-

ing, University of Nottingham, 8th April, 2002. G R Beddoe. A case-based

reasoning approach to personnel rostering

Appendix A

Violation Feature Indices

Table A.1: Cover constraint violation feature indices

1. Violation magnitude

2. Number of assigned hours on the day of the violation (all nurses)

3. Number of assigned hours on the day of the violation (NurseType)

4. Number of unassigned hours on the day of the violation (all nurses)

5. Number of unassigned hours on the day of the violation (NurseType)

6. Qualification cover array for UNASSIGNED shifts on the day of the violation

7. Qualification cover array for OFF shifts on the day of the violation

8. Specialty training cover array for UNASSIGNED shifts on the day of the violation

9. Specialty training cover array for OFF shifts on the day of the violation

10. Specialty training cover array for UNASSIGNED shifts for 4 days around the violation

11. Specialty training cover array for OFF shifts for 4 days around the violation

Table A.2: HardRequest constraint violation feature indices

1. Number of assigned hours on the day of the violation (all nurses)

2. Number of assigned hours on the day of the violation (NurseType)

3. Number of unassigned hours on the day of the violation (all nurses)

4. Number of unassigned hours on the day of the violation (NurseType)

5. Shift pattern for the nurse for 4 days around the violation

6. On-off pattern for the nurse for 4 days around the violation

7. Qualification cover array for UNASSIGNED shifts on the day of the violation

8. Qualification cover array for OFF shifts on the day of the violation

9. Specialty training cover array for UNASSIGNED shifts on the day of the violation

10. Specialty training cover array for OFF shifts on the day of the violation

11. Specialty training cover array for UNASSIGNED shifts for 4 days around the violation

12. Specialty training cover array for OFF shifts for 4 days around the violation

Table A.3: MaxDaysOn constraint violation feature indices

1. Violation magnitude

2. Percentage of assigned hours over the period of the violation (all nurses)

3. Percentage of assigned hours over the period of the violation (NurseType)

4. Shift pattern for the nurse over the period of the violation

5. On-off pattern for the nurse over the period of the violation

6. Average qualification cover array for UNASSIGNED shifts over the period of the violation

7. Average qualification cover array for OFF shifts over the period of the violation

8. Average specialty training cover array for UNASSIGNED shifts over the period of the violation

9. Average specialty training cover array for OFF shifts over the period of the violation

Table A.4: MaxHours constraint violation feature indices

Table A.5: MinDaysOn constraint violation feature indices

Table A.6: MinHours constraint violation feature indices

Table A.7: SingleNight constraint violation feature indices

1. Percentage of assigned hours over the period of the violation (all nurses)

2. Percentage of assigned hours over the period of the violation (NurseType)

3. Shift pattern for the nurse over the period of the violation

4. On-off pattern for the nurse over the period of the violation

5. Qualification cover array for UNASSIGNED shifts over the period of the violation

6. Qualification cover array for OFF shifts over the period of the violation

7. Qualification cover array for NIGHT shifts over the period of the violation

8. Specialty training cover array for UNASSIGNED shifts over the period of the violation

9. Specialty training cover array for OFF shifts over the period of the violation

10. Specialty training cover array for NIGHT shifts over the period of the violation

Table A.8: SoftRequest constraint violation feature indices