
A Comparative Evaluation of Metaheuristic Approaches to the Problem of Curriculum-Based Course Timetabling

Daniil Bogdanov

Bachelor’s Thesis at KTH, Royal Institute of Technology
School of Computer Science and Communication

Supervisor: Pawel Herman
Examiner: Mårten Olsson

May 21st, 2015

Abstract

Timetabling is an active area of research and is used in a wide range of applications. As the development of most of these applications is on its way towards automation, the need for automated timetabling increases. Despite many years of research and development of automated approaches, solving NP-hard problems such as timetabling problems remains a challenge. Metaheuristic-based approaches to these problems are constantly being refined and further developed as the complexity of these applications increases. But despite the increase in complexity, the time it takes for these algorithms to solve these problems is constantly being challenged.

While this thesis covers the fundamentals of metaheuristic approaches to the problem of timetabling, its main focus is to compare how two known metaheuristic algorithms, Tabu Search and Simulated Annealing, perform across different scales of resources that are to be scheduled. To attempt fairness, similar implementations of these two algorithms were made in order to eliminate systematic biases. For each set of resources the algorithms solve a timetabling problem under a limited amount of time and computational capacity. The collective quality of all the produced timetables was compared. The results show that Simulated Annealing performs slightly better in the majority of the instances, but with little margin in the collective quality of all tables. Although an attempt was made to set a common ground for these similar metaheuristic approaches, underlying difficulties in comparing algorithms remain and are discussed.


Summary

Timetabling is an active research area with a large number of applications. As the development of most of these applications is moving towards automation, the need for automated timetabling increases. Despite many years of research and development of automated approaches, solving NP-hard problems such as timetabling problems is still a challenge. Metaheuristic methods that solve these problems are constantly refined and further developed as the complexity of the applications they address increases. But despite the increased complexity, the time it takes these algorithms to solve the problems is constantly being challenged.

While this thesis covers the fundamentals of metaheuristic approaches to timetabling problems, its main focus is to compare how two known metaheuristic algorithms, Tabu Search and Simulated Annealing, perform at different scales of the resources to be scheduled. To make the comparison fair, the two algorithms were implemented in a similar way, with the aim of eliminating systematic errors. For each set of resources the algorithms solve a timetabling problem under limited time and computational capacity. The collective quality of the produced timetables is compared. The results show that Simulated Annealing performs slightly better in most cases, but with little margin in terms of the collective quality of each algorithm. Despite attempts to establish a common ground for these similar methods, the underlying difficulties in comparing algorithms are discussed.


Acknowledgements

I would like to thank Helena Sandgren at the Royal Institute of Technology for her cooperation and expertise on how scheduling is done at her university. It gave me insights into the complexity of these problems and how real-world timetabling problems are being solved today.

I especially want to thank Pawel Herman at the Royal Institute of Technology, Dept. of Computational Biology, for supervising me through the work of this thesis. I have learned a lot on the subject of academic work thanks to his guidance. As he has been the only one familiar with my work, the feedback I received has been vital in writing this report.


Contents

1 Introduction

2 Background
    2.1 Curriculum-based course timetabling
    2.2 Timetabling at the Royal Institute of Technology
    2.3 Constraints
    2.4 Metaheuristic algorithms
    2.5 State of the art
    2.6 Complexity theory

3 Problem description

4 Method
    4.1 Definitions and notations
    4.2 Quality of timetables
        4.2.1 Hard constraints
        4.2.2 Soft constraints
    4.3 Tabu Search
    4.4 Simulated Annealing
        4.4.1 Acceptance function
        4.4.2 Temperature
    4.5 The standard setting
    4.6 Variation of parameters
    4.7 Initialization
    4.8 Finding neighboring timetables
    4.9 Implementation

5 Evaluation

6 Discussion
    6.1 Control parameters
        6.1.1 Temperature
        6.1.2 Tabu list
    6.2 Finding neighboring timetables
    6.3 Cost parameters

7 Conclusion


1 Introduction

Automated resource planning has a wide range of applications in areas such as universities, high schools, sports, employment and hospitals. Wherever there is a need for order and structure in both time and space, well-thought-out planning is desirable. When a plan fails to execute at the right time and place, more resources have to be spent than the planning itself would have cost. This is why reliable and accurate planning is desired. While the essence of planning has been somewhat static over time, the areas in which planning is needed have become more complex. But with the increasing computational power of today’s computers and the development of sophisticated algorithms, one can hope that this complexity can be handled in a reasonable way.

While resource planning is used in a wide range of applications, the purposes of planning can be many. For instance, hospitals need resource planning when scheduling human workflow, both for employees and for the flow of patients. Resource planning is also used within companies when constructing a timetable for employees and their working schedule. An automated planning system could also help to handle the salary of each employee, since the system keeps track of how many hours each employee works. For universities, a timetable for all students has to be created at least once every year. While this is still done manually in many universities, ways of automating this work have long been an active subject [1]. Over the past decades there has been a large interest in applying metaheuristic algorithms to university timetabling [2]. Metaheuristic approaches are by nature general-purpose algorithms which can be applied to problems that have a great deal of variation, such as timetabling.

Timetabling problems can be regarded as high-dimensional, non-Euclidean, multiconstraint combinatorial optimization problems [3]. But despite knowing what categories these problems fall into, they still lack a formal definition. This is because it is difficult to find a general formulation that suits all cases. Institutions have their own definition of the problem depending on the area of investigation and the nature of the problem. This makes the field of resource planning hard to develop further in a systematic way [4]. But this should not keep one from evaluating and comparing metaheuristics to gain a better understanding of how algorithms that solve these problems perform under various conditions and restrictions.

Efforts to standardize timetabling problems through the International Timetabling Competitions (ITC) confirm that the community is aware of such needs [4]. The aim of these competitions has been to create a common ground for comparing algorithms on standardized benchmarking tests. Formulating problems which cover the fundamentals of timetabling has been one way of setting a common ground in this area. Benchmarking results which compare metaheuristic approaches show which algorithms perform well in specific tests, but not always in varying circumstances. Projects that focus on comparing and analyzing performance under varying conditions, such as the Metaheuristic Network (MN), have put effort into creating a common ground for comparisons [5]. Analyzing metaheuristics under these conditions, and exposing them to different environments rather than to static standardized tests only, might help to understand the key factors of when and why some algorithms perform better than others.


2 Background

In essence, timetabling problems consist of assigning a number of events, each having its own features, to a limited number of resources subject to certain constraints [6]. The solution strongly depends on these constraints since they define the outlines of the problem. Verifying that a timetable indeed solves the problem requires one to understand how the constraints reflect upon the resources involved. Different constraints give rise to different conclusions regarding which approach to the problem is the best one. This is why comparisons between solutions regarding efficiency and accuracy for problems with different constraints may give different results than if the problems were defined on equal terms.

2.1 Curriculum-based course timetabling

Timetabling within universities can be divided into two categories depending on the course enrollment system. In some universities students are obliged to pick their own courses, and in other universities there are curriculums with predetermined courses in which the students enroll. The two systems are characterized as Course-Based Timetabling (CTT) and Curriculum-Based Course Timetabling (CB-CTT) respectively. Regardless of the enrollment system, there is a great need for automated timetabling systems in both cases [1]. While some universities already have automation in their work, there is still a need for manual aid in order to construct a high-quality table. The main problem with automated systems is that they must be able to generate high-quality timetables despite the huge variation in constraints and resources that schedulers use to construct a table [7]. They must be easy to use and include all functions needed to generate high-quality timetables. Since each university has its own constraints to satisfy, which may vary over time, and different resources to schedule, both commercial packages and self-developed automated systems often fail to provide all the functions required for a fully automated timetabling system.

2.2 Timetabling at the Royal Institute of Technology

The Royal Institute of Technology (KTH) falls under the category of CB-CTT. A timetable is created for each semester and is based on the courses each curriculum offers its students. Typically there are twenty possible events each week in which a curriculum can schedule its classes: two before noon and two after noon on each weekday, each being 2 hours long and having a fixed starting hour. The classes start at 8 am, 10 am, 1 pm and 3 pm. Some classes, such as laboratories, require more than 2 hours but still start at the same fixed starting hours. Courses scheduled after 5 pm are considered evening courses and are often outside the portfolio of mandatory courses offered by curriculums.

Each course mainly consists of theoretical lectures and practical classes. Occasionally some courses have laboratories, tests or seminars. These special events usually bring about additional constraints since they require that the students have been taught the material beforehand. Therefore they cannot be scheduled like any other event and have to be handled with caution.

The rooms in which classes are held vary depending on the type of class. Bigger halls with the capacity to fit all students of a given course are used for theoretical classes. Professors give the theoretical lectures, and because their time is valuable and they are a limited resource for universities, theoretical classes are never divided into smaller groups. Practical classes, on the other hand, are usually divided into smaller groups and require more classrooms for each class. Here students from a higher academic year often act as teachers and give the practical classes.

Two timetables are constructed each year, one for each semester. A large set of courses, retrieved from a database, is used as the foundation for the upcoming timetable. Additional information for each course may be found in this database and often has to be handled manually. This includes information such as the different types of examination modules, requests and preferences from the lecturer and, in general, how the course is intended to be structured. These requests vary a lot from course to course and are one of the reasons why a fully automated timetabling system is not yet in use at KTH [7].

Bigger universities often have a more complex scheduling system since more instances are involved, which makes scheduling more challenging. But regardless of the size of the university, there are some features and constraints that are almost always present in timetabling problems.

2.3 Constraints

Constraints can be divided into two types, hard constraints and soft constraints. If a hard constraint is violated, the timetable cannot be accepted as a feasible solution. This is because hard constraints are usually physically impossible to ignore and must therefore be highly prioritized. Soft constraints are less forbidding and are therefore given lower priority, but ideally one wants to violate as few constraints as possible in order to find the best possible solution. Soft constraints usually arise as preferences from the different resources that are to be scheduled. Soft constraints may be defined in such a way that they work against each other. One then has to specify which constraint has the greater impact on the quality of the timetable and optimize the timetable in that regard.

2.4 Metaheuristic algorithms

Metaheuristic algorithms may be applied to a wide variety of problems since they are not problem specific. Their success comes from the fact that they cleverly generate solutions to problems that are often hard to solve exactly. They combine heuristic methods with the aim of exploring the solution space of a problem efficiently and effectively. The solution space for timetabling problems consists of the set of all possible combinations one can make in order to construct a table. The shape of the solution space depends on an objective function which has to be defined for each problem. This function determines the quality of the tables and affects how metaheuristics explore the solution space.

New solutions are generated either by exploring the solution space locally or by constructing a solution from scratch, adding components until a complete table is generated. These ways of finding solutions are referred to as local search and constructive methods respectively. Some local search methods start to explore the solution space from an initial solution, which is often generated randomly or in a greedy way. Algorithms which use this approach are classified as trajectory based. While these approaches may quickly generate approximate solutions, accuracy is traded for speed. But since methods for finding exact solutions are hard to construct due to the complexity of the problem, metaheuristics are often a good first approach to these problems.

Intensification and diversification are two terms often used in the field of metaheuristics. Intensification refers to the ability to search promising regions of the solution space thoroughly in order to find the high-quality solutions they contain. Diversification refers to the ability to explore new areas and to avoid getting trapped in confined areas of the solution space. While both aspects are vital to these algorithms, they may sometimes work against each other [8]. It is therefore important to find a balance between them to achieve good performance.

2.5 State of the art

There are many approaches to the problem of timetabling, and methods have been hybridized to optimize the solution method for specific timetabling problems. New metaheuristic approaches have recently been developed for specific timetabling problems and have shown promising results.

Genetic algorithms, which draw their inspiration from Darwinian evolution, have been a hot topic for researchers in the field of timetabling. These algorithms represent the timetable as a long encoded string, analogous to DNA encoding. Mutations are applied to these strings in the form of operators to find better timetables. These operators come in a variety of forms and have long been studied to optimize specific problems. They bring diversity into the generation of new solutions and have increased the adaptiveness in many applications. Today genetic algorithms successfully timetable courses at the University of Edinburgh, the Harvard Business School, Kingston University and several other institutions [1].

Although metaheuristic algorithms such as genetic algorithms have been shown to perform well in standardized benchmarking tests such as those at ITC, other approaches have also made it to the spotlight. For problems with many constraints that have to be handled, Boolean satisfiability (SAT) is a preferred way of solving them. For problems where the resources can be divided into smaller groups and assigned independently from each other, a two-stage Integer Linear Programming (ILP) approach has been shown to perform well [9].

Other popular algorithms used in today’s benchmarking tests are Memetic Algorithms, Constraint Logic Programming (CLP), Tabu Search (TS) and Simulated Annealing (SA), to name a few.

2.6 Complexity theory

Timetabling has long been known to belong to the class of NP-hard (non-deterministic polynomial-time hard) problems [9]. Problems in P are those for which a solution can be found by a deterministic algorithm in polynomial time; algorithms for these problems are well understood. Problems in NP are characterized by being easy to check: if a candidate solution is given, its validity can be verified in polynomial time, just as for problems in P, but no polynomial-time algorithm for finding a solution is known in general. NP-hard problems are at least as hard as every problem in NP, and they need not themselves have solutions that are easy to check. No algorithm has yet been formulated that solves an NP-hard problem in a reasonable (polynomial) amount of time. Should such an algorithm be found, it could be used to solve every problem in NP in polynomial time, since each of these problems can, in a mathematical sense, be translated into it. The question of whether P is really the same as NP has been asked for decades and is known as the P vs. NP problem. It is still one of the unsolved Millennium Prize Problems.

3 Problem description

Curriculum-based course timetabling problems consist of constructing a timetable by assigning classes from each course given by the curriculums to specific events. While the problem sounds simple at first, many aspects have to be considered, which quickly complicates the problem. The solution to a timetabling problem is a timetable where every class has been assigned an event with no violations of hard constraints.

A synthetic timetabling problem was approached using two different metaheuristic algorithms, Tabu Search and Simulated Annealing. The aim of this investigation was to compare how well these algorithms solved for timetables under certain conditions. A standard setting of the resources was defined as an anchor point for the problem, and parameters that both scaled up and increased the difficulty of the problem were varied. The algorithms solved for timetables under these varying conditions with a fixed amount of time and computational capacity. The purpose was to see how these algorithms performed at various scales and difficulties under restricted limitations. The restrictions were set to 10 minutes or 10000 iterations, in order to see how well the algorithms solved harder and harder problems.

Many metaheuristic algorithms solve optimization problems in very different ways, which can make comparisons between them biased. TS and SA are two metaheuristics that share many similarities. For instance, they both search locally in the solution space and are both trajectory based. For this reason, many of the same processes surrounding the problem could be used for both algorithms, making the comparison less biased.

The structure of the timetabling problem was inspired by the timetables used at KTH. But the resources used were ultimately synthetically crafted, to make the variations that were to be made easier to manage.

4 Method

4.1 Definitions and notations

Since there is no general formulation of timetabling problems, definitions and notations often vary. To make this problem self-contained, its definitions and notations had to be specified. Throughout this thesis the following notations and definitions are used:

Timeslot - $\mathcal{T}$ denotes the set of all timeslots. A timeslot is a time period of two hours, and each weekday contains four of them. The timeslots start at fixed times throughout the day: 8 am, 10 am, 1 pm and 3 pm. A whole week of school therefore consists of 20 timeslots in which any class can be scheduled. The size of this set is denoted $T$.

Room - $\mathcal{R}$ denotes the set of all rooms. Each class has to take place in a room. All rooms are regarded as equally fit for any class to be scheduled in. Logistic features such as distances between rooms and maximum capacities were not considered in this problem. The size of this set is denoted $R$.

Event - $\mathcal{E}$ denotes the set of all events. An event is the composition of a timeslot and a room. The size of this set is therefore $E = T \cdot R$. Every class is to be scheduled to one element in this set.

Curriculum - $\mathcal{C}_1$ denotes the set of all curriculums. Each curriculum has a total of 4 courses and 2 teachers, each responsible for 2 courses. The size of this set is denoted $C_1$.

Course - $\mathcal{C}_2$ denotes the set of all courses. Each course consists of classes which are to be scheduled at different timeslots. Each course has one assigned teacher. Since parameters that scale the problem were to be varied, the number of classes for each course was set to vary with them, to keep the weekly workload of each curriculum constant. The size of this set is denoted $C_2$.

Class - $\mathcal{C}_3$ denotes the set of all classes. Each class has a duration of two hours, matching a timeslot exactly. Classes come in two different types: theoretical classes and practical classes. There is no difference in the way these classes are taught; the type is only used as a feature of the elements of this set. The size of this set is denoted $C_3$.

Teacher - $\mathcal{P}$ denotes the set of all teachers. Each teacher is responsible for 2 courses which belong to the same curriculum. Teachers may have other matters to attend to and therefore have timeslots in which they cannot give classes. The size of this set is denoted $P$.
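As an illustration of these definitions, the sets of timeslots and events can be sketched in Python. This is only a sketch under stated assumptions: the names `DAYS`, `START_HOURS`, `build_timeslots` and `build_events`, as well as the room labels, are hypothetical and are not taken from the implementation used in this thesis.

```python
from itertools import product

# Five weekdays with four fixed two-hour timeslots each (8 am, 10 am, 1 pm, 3 pm).
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]
START_HOURS = [8, 10, 13, 15]


def build_timeslots():
    """Return all timeslots of a week as (day, start_hour) pairs."""
    return list(product(DAYS, START_HOURS))


def build_events(timeslots, rooms):
    """An event is the composition of a timeslot and a room."""
    return list(product(timeslots, rooms))


timeslots = build_timeslots()
rooms = ["room_a", "room_b", "room_c"]  # hypothetical room labels
events = build_events(timeslots, rooms)

print(len(timeslots))  # 20 timeslots per week
print(len(events))     # 60 events, i.e. T * R = 20 * 3
```

Representing an event as a (timeslot, room) pair makes the relation between the number of events, timeslots and rooms explicit.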

Quantities measuring ratios between different parameters of the problem were defined to give a better overview of the structure of the timetable being scheduled:

Attendance level $\rho$ - The attendance level is a measure of how many classes any given curriculum is supposed to give each week. To balance the workload for all students each week, $\rho$ is simply defined as the number of classes each week per curriculum.

Unavailability level $\sigma$ - The unavailability level is a measure of how often each teacher is unavailable. It was simply defined as the number of unavailable timeslots per week per teacher.

Event compactness $\Omega$ - The event compactness of a timetable was defined as the ratio between scheduled events and all available events. Since every class was to be scheduled, the ratio could be defined as $\Omega = C_3/E$.

Unavailability compactness $\zeta$ - The unavailability compactness is similar to $\sigma$ but normalized by the number of timeslots each week. The ratio was defined as $\zeta = \sigma/20$. $\zeta = 0$ means teachers are always available and $\zeta = 1$ means they can never teach a class.

Class compactness $\eta$ - The class compactness is similar to $\rho$ but normalized by the number of timeslots each week. The ratio was defined as $\eta = \rho/20$. $\eta = 0$ means there are no classes for any student. $\eta = 1$ means students have no empty timeslot throughout the timetable.
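The ratios defined above can be computed directly from the sizes of the sets. The following Python sketch is illustrative only: the function names and the numbers in the example (50 classes, 4 rooms, 4 unavailable timeslots, 10 classes per curriculum) are hypothetical and are not the settings used in this thesis.

```python
TIMESLOTS_PER_WEEK = 20


def event_compactness(num_classes, num_timeslots, num_rooms):
    """Omega = C3 / E, where E = T * R is the number of available events."""
    return num_classes / (num_timeslots * num_rooms)


def unavailability_compactness(sigma):
    """zeta = sigma / 20, with sigma the unavailable timeslots per teacher."""
    return sigma / TIMESLOTS_PER_WEEK


def class_compactness(rho):
    """eta = rho / 20, with rho the classes per week per curriculum."""
    return rho / TIMESLOTS_PER_WEEK


# Hypothetical example values.
print(event_compactness(num_classes=50, num_timeslots=20, num_rooms=4))  # 0.625
print(unavailability_compactness(sigma=4))                               # 0.2
print(class_compactness(rho=10))                                         # 0.5
```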

4.2 Quality of timetables

In order to say something about the quality of the timetables that these algorithms construct, one has to specify what a good and a bad timetable mean. Assuming that tables do not violate any of the hard constraints, the table with the lowest cost is regarded as the best solution. For this task a cost function was used to determine the quality of each table. The cost function considered all constraints and evaluated how many times each constraint was violated by a given table. Since violating different constraints has different impacts on the quality of the table, a cost parameter $\lambda_c$ was associated with each constraint $c$ to differentiate their impact. A constraint with a higher cost parameter contributes a higher cost when violated, making the timetable less favorable. The cost function was defined as

$$C(x) = \sum_{c} \lambda_c \cdot \Lambda_c(x) \tag{1}$$

where $\Lambda_c(x)$ is the total number of violations of constraint $c$ and $\lambda_c$ is the corresponding weight of that constraint. The output of the cost function is thus a weighted linear sum of the costs from each constraint and is used as a measure of quality. This way of defining the objective function for timetabling problems is one of the more common approaches [2].

While the relation between the cost parameters of the hard and soft constraints affects the construction of the timetable, little effort was spent on tuning these parameters. High costs were assigned to the hard constraints so that they would be prioritized over the soft constraints, which were given low costs.
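The weighted sum in (1) can be sketched in a few lines of Python. The weights are those stated for the constraints in Sections 4.2.1 and 4.2.2; S4 is left out because its weight is not reproduced here, and H5 and H6 are left out because they are met by construction. The dictionary layout, the function names and the violation counts in the example are assumptions made for illustration.

```python
# Weights from Sections 4.2.1 and 4.2.2 (S4, H5 and H6 intentionally omitted).
WEIGHTS = {
    "H1": 100, "H2": 75, "H3": 100, "H4": 100,
    "S1": 2, "S2": 1, "S3": 2,
}


def cost(violations, weights=WEIGHTS):
    """Weighted linear sum of the violation counts, as in equation (1)."""
    return sum(weights[c] * count for c, count in violations.items())


def is_feasible(violations):
    """A table is feasible when no hard constraint (H*) is violated."""
    return all(count == 0 for c, count in violations.items() if c.startswith("H"))


# Hypothetical violation counts for one table.
example = {"H1": 0, "H2": 1, "H3": 0, "H4": 0, "S1": 3, "S2": 2, "S3": 1}
print(cost(example))         # 75 + 6 + 2 + 2 = 85
print(is_feasible(example))  # False, since H2 is violated once
```

With these weights a single hard-constraint violation outweighs dozens of soft-constraint violations, which is how the hard constraints end up being prioritized.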

4.2.1 Hard constraints

A solution was regarded as feasible if there were no violations of any of the hard constraints. This is the reason for the high cost parameter of each hard constraint. The hard constraints used in this timetabling problem were the following:

H1: Classes of the same curriculum cannot be scheduled in different rooms at the same timeslot. A violation of this constraint gave the table a cost of $\lambda_{H1} = 100$ for each class that was scheduled on the same timeslot.

H2: Classes of the same type, of the same course, cannot occur more than once each day. Students need time to prepare for new material, which is the reason for this constraint. Each class that violated this constraint contributed $\lambda_{H2} = 75$ to the cost.

H3: Each teacher has unavailable timeslots in which classes cannot be given. Each scheduled class that violated this constraint contributed a cost of $\lambda_{H3} = 100$.

H4: Obey the maximum number of classes of each type per week for each course. This constraint considered the students’ workload and spread it uniformly over the whole week. A violation of this constraint contributed a cost of $\lambda_{H4} = 100$.

H5: All classes must be scheduled at a distinct time and room. This constraint was met automatically since the initialization process of each table made sure that all classes were scheduled somewhere in the table.

H6: Two classes cannot be scheduled in the same room at the same timeslot. This constraint was met automatically since, by the implementation of the problem, each room could only have one class scheduled at any timeslot.
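To make the counting of violations concrete, the following Python sketch counts violations of H1 and H3 for a small table. It rests on several assumptions: the record type `Scheduled`, the function names and the example data are hypothetical, and H1 is interpreted as counting every class that shares a timeslot with another class of the same curriculum. The thesis implementation may count differently.

```python
from collections import Counter, namedtuple

# Hypothetical record for one scheduled class.
Scheduled = namedtuple("Scheduled", "curriculum teacher timeslot room")


def count_h1(timetable):
    """Count classes that share a timeslot with another class of the same curriculum."""
    per_slot = Counter((s.curriculum, s.timeslot) for s in timetable)
    return sum(n for n in per_slot.values() if n > 1)


def count_h3(timetable, unavailable):
    """Count classes placed in a timeslot where their teacher is unavailable."""
    return sum(1 for s in timetable if s.timeslot in unavailable.get(s.teacher, set()))


timetable = [
    Scheduled("c1", "t1", 0, "r1"),
    Scheduled("c1", "t2", 0, "r2"),  # clashes with the class above (H1)
    Scheduled("c2", "t3", 0, "r3"),
    Scheduled("c1", "t1", 5, "r1"),  # teacher t1 is unavailable in timeslot 5 (H3)
]
unavailable = {"t1": {5}, "t3": {1}}

print(count_h1(timetable))               # 2
print(count_h3(timetable, unavailable))  # 1
```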

4.2.2 Soft constraints

The soft constraints are the ones that should be violated first, if any. Ideally a solution should not violate any constraint, but such solutions are often impossible to construct and violations will therefore occur. The soft constraints used in this timetabling problem were the following:

S1: Minimize consecutive classes of the same type for each course. Theoretical and practical classes should be taught at the same pace. For each pair of classes that violated this constraint, a cost of $\lambda_{S1} = 2$ was added to the cost function.

S2: For each course, prefer to schedule classes of the same type in the same room. Students often prefer to have classes of the same type in the same room. The number of different rooms used for classes of the same type in a course was used as the value of the cost. This constraint therefore had a weight of only $\lambda_{S2} = 1$.

S3: For each teacher, minimize consecutive classes on any given day. Since each teacher has more than one course, this can happen, and lecturing for long hours is tiring. For each consecutive class scheduled, a cost of $\lambda_{S3} = 2$ was added to the cost function.

S4: For each curriculum, even out the classes throughout the week over morning and evening timeslots. Classes starting at 8 am and 3 pm contributed equally much in opposite directions. The same was done for classes given at 10 am and 1 pm. The difference in the number of classes between these two time periods contributed to the cost function. Classes at 10 am and 1 pm contributed with $\lambda_{S4}$ and classes scheduled at 8 am and 3 pm contributed with $2 \cdot \lambda_{S4}$.

4.3 Tabu Search

Tabu Search (TS) is a local search metaheuristic that utilizes a list, called the tabu list, which contains previously visited tables. When exploring the solution space, the list is used to make sure that already visited tables are not revisited repeatedly. This is the diversification feature of the algorithm, since it prevents the algorithm from getting stuck in the same area of the solution space. The length of the tabu list may vary depending on the implementation. It commonly depends on the cost of the current timetable or the number of iterations performed, but sometimes it has no dependency at all [6]. The length of the list determines how soon the algorithm may cross an old path again. Since the size of the list affects the algorithm (consider the limiting cases of zero length and infinite length), an optimal size can be determined depending on the problem [12]. Determining an optimal length was left outside the scope of this report, and the list was given a maximum length of 100. Pseudo code of the algorithm is shown below.

Algorithm 1: Pseudo code of Tabu Search

xBest ← Initial table
tabuList ← empty list

while (not stop condition met)

    x0 ← null
    for (x in NeighborhoodsOf(xBest))
        if (x not in tabuList and (x0 is null or costFunction(x) < costFunction(x0)))
            x0 ← x
        end
    end

    if (costFunction(x0) < costFunction(xBest))
        xBest ← x0
    end

    put x0 in tabuList
    if (size of tabuList > allowed size)
        remove first element in tabuList
    end

end
return xBest
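A minimal Python sketch of Algorithm 1 is given here for illustration. It is not the implementation used in this thesis and makes several assumptions: tables are hashable values, the stop condition is an iteration limit only (the time limit is left out), the search stops early if every neighbor is tabu, and the function and parameter names are hypothetical. The toy problem at the end (integers with neighbors `x - 1` and `x + 1`) merely stands in for timetables.

```python
from collections import deque


def tabu_search(initial, neighbors, cost, max_iterations=10000, tabu_size=100):
    """Follow Algorithm 1: take the best non-tabu neighbor of the best table."""
    best = initial
    # A bounded deque drops its oldest entry once tabu_size is exceeded.
    tabu = deque(maxlen=tabu_size)
    for _ in range(max_iterations):
        candidate = None
        for x in neighbors(best):
            if x in tabu:
                continue
            if candidate is None or cost(x) < cost(candidate):
                candidate = x
        if candidate is None:  # every neighbor is tabu, nothing left to try
            break
        if cost(candidate) < cost(best):
            best = candidate
        tabu.append(candidate)
    return best


# Toy usage: minimize (x - 7)^2 over the integers, starting from 0.
result = tabu_search(
    initial=0,
    neighbors=lambda x: [x - 1, x + 1],
    cost=lambda x: (x - 7) ** 2,
)
print(result)  # 7
```

The bounded `deque` plays the role of the tabu list with a maximum length of 100, so no explicit removal of the first element is needed.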

4.4 Simulated Annealing

Simulated annealing (SA) has a thermodynamic analogy where feasible solutions are regarded as states of a system. The system in this case is a piece of imperfect material, often thought of as a composite metal. The goal is to reduce the defects in this material by minimizing the internal energy. This is done by heating it up, which overcomes potential barriers in the microscopic structure, and then cooling it down to find lower, more stable energy states. This process is controlled by the temperature of the system and repeated until the desired property of the metal is reached. Energy in this analogy is the cost of a state, and a state is a scheduled timetable. The analogy to the heating process is that the algorithm may accept worse solutions when exploring the solution space. This is the diversification feature of SA and can be seen on line 11 of the pseudo code of Simulated Annealing below.


Algorithm 2: Pseudo code of Simulated Annealing

    01 xBest ← initial table
    02 iterations ← 0
    03 while (stop condition not met)
    04
    05     T ← Temperature(iterations)
    06     bestCandidate ← null
    07     for (x in NeighborhoodOf(xBest))
    08         if (costFunction(x) < costFunction(xBest))
    09             xBest ← x
    10         else
    11             if (A(x, xBest, T) > random real number from 0 to 1)
    12                 xBest ← x
    13             end
    14         end
    15     end
    16     iterations ← iterations + 1
    17 end
    18
    19 return xBest
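A minimal Java sketch of a simulated annealing loop is given here for a generic solution type. It is an illustration under assumptions rather than the thesis implementation: it samples a single random neighbor per iteration, keeps the current solution separate from the best one seen, and uses an exponential temperature of the form T0 · exp(µ · i). All names are hypothetical.

```java
import java.util.Random;
import java.util.function.BiFunction;
import java.util.function.ToDoubleFunction;

final class SimulatedAnnealingSketch {

    /** Returns the lowest-cost solution seen within maxIterations steps. */
    static <S> S anneal(S initial, BiFunction<S, Random, S> randomNeighbor,
                        ToDoubleFunction<S> cost, double t0, double mu,
                        int maxIterations, Random rng) {
        S current = initial;
        S best = initial;
        for (int i = 0; i < maxIterations; i++) {
            double temperature = t0 * Math.exp(mu * i); // decreases when mu < 0
            S candidate = randomNeighbor.apply(current, rng);
            double delta = cost.applyAsDouble(candidate) - cost.applyAsDouble(current);
            // Better candidates are always taken; worse ones with probability exp(-delta / T).
            if (delta < 0 || Math.exp(-delta / temperature) > rng.nextDouble()) {
                current = candidate;
            }
            if (cost.applyAsDouble(current) < cost.applyAsDouble(best)) {
                best = current;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy problem: minimise (x - 7)^2 over the integers with random steps of +1 or -1.
        Integer result = SimulatedAnnealingSketch.<Integer>anneal(
                0,
                (x, r) -> r.nextBoolean() ? x + 1 : x - 1,
                x -> (x - 7) * (x - 7),
                4.3, -6.83e-5, 10000, new Random(42));
        System.out.println(result);
    }
}
```

Note that the random number is drawn as a real value in [0, 1), since an integer draw of 0 or 1 would make the comparison with the acceptance probability meaningless.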

4.4.1 Acceptance function

The acceptance function determines whether a worse timetable should be accepted in the process of finding the best solution. This function was defined to depend on the temperature and on the costs of the tables considered. Other implementations vary the dependency, sometimes being independent of the costs of the tables considered [3]. The acceptance function in this timetabling problem was defined as

$$A(x, x_0, T_{SA}) = e^{-\frac{\Delta C}{T_{SA}}} = e^{-\frac{C(x) - C(x_0)}{T_{SA}}} \qquad (2)$$

where C(x) is the cost function, x0 is the current solution, x is the candidate solution and TSA is the temperature (see 4.4.2 for a description of the temperature). Since (2) is only used when a worse candidate is found, the cost difference in the exponent is strictly positive, ensuring that 0 < A < 1.

The structure of this acceptance function has its origins in statistical mechanics. (2) can be viewed as the ratio between two Boltzmann factors. Each factor depends on the energy of a particular state and determines the relative probability that a system is found in that state.
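The ratio can be written out explicitly. As a sketch, assume the usual Boltzmann factor $e^{-E/(k_B T)}$ for a state with energy $E$, and identify the energy with the cost $C$ and $k_B T$ with $T_{SA}$ (these identifications are assumptions made here for illustration):

$$\frac{e^{-C(x)/T_{SA}}}{e^{-C(x_0)/T_{SA}}} = e^{-\frac{C(x) - C(x_0)}{T_{SA}}} = A(x, x_0, T_{SA})$$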

4.4.2 Temperature

While SA explores the solution space in a similar way to TS, it makes use of a parameter often referred to as the temperature, TSA, rather than a list. TSA is a parameter which decreases in value as the number of iterations increases. The temperature may also be implemented to rapidly increase in value at certain times or iterations, mimicking the process of cooling and reheating. The function that defines the temperature therefore has a significant impact on the efficiency of the algorithm, depending on the implementation [11]. Little effort was spent on finding an optimal temperature function since it was not the intention of this report. Instead, the temperature was defined in the following way

$$T_{SA}(i) = T_0 e^{\mu i} \qquad (3)$$


where T0 and µ are constants and i is the iteration count. These constants were determined by considering the probabilities of accepting worse tables in situations regarding the soft constraints.

Initially, the algorithm should accept worse solutions more frequently. As the cost of the current timetable decreases, the algorithm should have a lower probability of moving away from the minimum it is approaching in the solution space. Considering an arbitrary increase in the cost between two neighboring tables regarding only soft constraints, a typical increase was observed to be ∆C ≈ 10. A probability of 10 % was set as the threshold for accepting worse solutions in the beginning. Thus, calculating backwards, one could compute the constant T0 to be

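$$T_{SA}(0) = T_0 \cdot e^{\mu \cdot 0} = T_0 \;\rightarrow\; 0.1 = e^{-\frac{\Delta C}{T_0}} \;\rightarrow\; T_0 = -\frac{\Delta C}{\ln(0.1)} \approx 4.3 \qquad (4)$$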

The same was done for µ, but now considering the probability of accepting worse solutions when i = imax. Setting an acceptance probability of 1 %, one gets a similar result

$$0.01 = e^{-\frac{\Delta C}{T_{SA}(i_{max})}} \;\rightarrow\; T_{SA}(i_{max}) = -\frac{\Delta C}{\ln(0.01)}. \qquad (5)$$

And by using equation (3) with imax = 10000 as argument, we get

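$$T_{SA}(i_{max}) = T_0 \cdot e^{\mu \cdot i_{max}} \;\rightarrow\; \mu = \frac{1}{i_{max}} \ln\left(-\frac{\Delta C}{\ln(0.01)\, T_0}\right) \approx -6.83 \cdot 10^{-5}. \qquad (6)$$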

Having µ < 0 ensured that the temperature decreased as the iteration count increased.
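The boundary calculation in (4) to (6) can be checked numerically. The sketch below is only an illustration: the class and method names are hypothetical, while the inputs ∆C = 10, the probabilities 10 % and 1 %, and imax = 10000 are taken from the text.

```java
final class TemperatureSchedule {

    /** Temperature at which a cost increase deltaC is accepted with probability p. */
    static double temperatureFor(double deltaC, double p) {
        return -deltaC / Math.log(p);
    }

    /** Decay constant mu such that tStart * exp(mu * iMax) equals tEnd. */
    static double decayRate(double tStart, double tEnd, int iMax) {
        return Math.log(tEnd / tStart) / iMax;
    }

    /** Exponential schedule of equation (3). */
    static double temperature(double t0, double mu, int i) {
        return t0 * Math.exp(mu * i);
    }

    public static void main(String[] args) {
        double deltaC = 10.0;
        int iMax = 10000;
        double t0 = temperatureFor(deltaC, 0.10);   // equation (4)
        double tEnd = temperatureFor(deltaC, 0.01); // equation (5)
        double mu = decayRate(t0, tEnd, iMax);      // equation (6) with unrounded T0
        System.out.printf("T0 = %.3f, T(imax) = %.3f, mu = %.3e%n", t0, tEnd, mu);
        System.out.printf("mu with T0 rounded to 4.3 = %.3e%n", decayRate(4.3, tEnd, iMax));
    }
}
```

With the unrounded T0 the decay constant comes out as roughly −6.93 · 10^−5; the value −6.83 · 10^−5 in (6) is obtained when T0 is first rounded to 4.3.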

4.5 The standard setting

The timetabling problem was to be solved under various conditions. It was therefore necessary to define a standard setting which would act as an anchor point. The standard setting was defined as:

• 240 timeslots. This corresponded to 60 days of scheduling

• 6 curriculums

• 6 available rooms

• 12 classes per week for each curriculum

• 4 unavailable timeslots per week for each teacher


Through these values, the ratios defined in section 4.1 could be computed for the standard setting, as shown in the table below.

Ratio ρ σ Ω ζ η

Value 12 4 0.6 0.2 0.6

Table 1: Ratio quantities for the standard setting of the timetabling problem.

Note that C1 = R = 6. This was intentional since each curriculum can only take up 1 room at any given timeslot. But since η is less than 1.0, not all 20 timeslots of a week will be occupied for every curriculum, and thus scheduling for R < C1 was possible and preferred from an economical point of view.

4.6 Variation of parameters

Four different parameters of this timetabling problem were varied. T and R scaled E, which scaled the problem in size. ρ and σ scaled the difficulty of finding a solution. Increasing ρ meant more classes to handle, while increasing σ meant fewer possibilities for where to assign classes. The parameters and their variations are presented in the table below:

Parameter   minimum   maximum   interval step
T           120       360       40
R           4         8         1
ρ           10        20        2
σ           2         6         1

Table 2: The table shows the variations in each parameter. These parameters were the number of timeslots T, the number of available rooms R, the number of classes per week for each curriculum ρ and the number of unavailable timeslots per week for each teacher σ.

Although the number of timeslots was varied, it was easier to represent this as variations in days. This variation corresponded to 30 days of scheduling up to 90 days with an interval step of 10 days. Also, at the upper limit when varying ρ, critical ratios such as ζ = Ω = 1 were reached. These corresponded to a completely full timetable.
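The parameter sweeps of Table 2 and the conversion from timeslots to days can be enumerated with a few lines of Java. This is a sketch with hypothetical names; the factor of 4 timeslots per day is inferred from the standard setting (240 timeslots for 60 days) and is an assumption about the model.

```java
import java.util.ArrayList;
import java.util.List;

final class ParameterGrid {
    static final int TIMESLOTS_PER_DAY = 4; // assumed from 240 timeslots / 60 days

    /** All values from min to max (inclusive) in steps of step. */
    static List<Integer> range(int min, int max, int step) {
        List<Integer> values = new ArrayList<>();
        for (int v = min; v <= max; v += step) {
            values.add(v);
        }
        return values;
    }

    static int toDays(int timeslots) {
        return timeslots / TIMESLOTS_PER_DAY;
    }

    public static void main(String[] args) {
        for (int t : range(120, 360, 40)) {
            System.out.println(t + " timeslots = " + toDays(t) + " days");
        }
        System.out.println("R:     " + range(4, 8, 1));
        System.out.println("rho:   " + range(10, 20, 2));
        System.out.println("sigma: " + range(2, 6, 1));
    }
}
```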

4.7 Initialization

Because both algorithms are trajectory based, an initial timetable had to be generated. This generation was a mapping between C3 and E. It was done by randomly assigning classes to events without any consideration of the hard or soft constraints. The process only made sure that every element in C3 had a mapping to an element in E. This ensured that constraint H5 was met.
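A random initial mapping of this kind could look as follows. The sketch assumes that a timetable is stored as an array indexed by event, that each event holds at most one class, and that an empty event is marked with −1; the representation and all names are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

final class RandomInitialTimetable {
    static final int EMPTY = -1; // marker for an event without a class

    /** Places every class 0..numClasses-1 in a distinct, randomly chosen event. */
    static int[] generate(int numClasses, int numEvents, Random rng) {
        if (numClasses > numEvents) {
            throw new IllegalArgumentException("more classes than events");
        }
        int[] eventToClass = new int[numEvents];
        Arrays.fill(eventToClass, EMPTY);
        for (int c = 0; c < numClasses; c++) {
            eventToClass[c] = c;
        }
        // Fisher-Yates shuffle: spreads the classes uniformly over all events.
        for (int i = numEvents - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            int tmp = eventToClass[i];
            eventToClass[i] = eventToClass[j];
            eventToClass[j] = tmp;
        }
        return eventToClass;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(generate(5, 12, new Random(1))));
    }
}
```

No hard or soft constraint is consulted; the only guarantee is that every class ends up in some event.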


4.8 Finding neighboring timetables

While there are many ways of generating neighboring tables, only one operation for finding them was implemented. By randomly picking two distinct elements in E, a swap was made between their mappings to C3. Two outcomes could occur. In some cases one element lost its mapping to the other, and in other cases they would mutually swap classes.
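The swap operation can be sketched in Java as below, under the same kind of assumption as before: a timetable is an array indexed by event, with −1 marking an empty event. The names and the representation are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

final class SwapNeighbor {

    /** Returns a copy of the table in which two distinct random events exchanged contents. */
    static int[] neighbor(int[] eventToClass, Random rng) {
        int n = eventToClass.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two events");
        }
        int a = rng.nextInt(n);
        int b = rng.nextInt(n - 1);
        if (b >= a) {
            b++; // shift so that b is uniform over the indices other than a
        }
        int[] copy = eventToClass.clone();
        copy[a] = eventToClass[b];
        copy[b] = eventToClass[a];
        return copy;
    }

    public static void main(String[] args) {
        int[] table = {0, 1, -1, 2, -1, 3};
        System.out.println(Arrays.toString(neighbor(table, new Random(7))));
    }
}
```

If one of the two events is empty, the class simply moves to the other event; if both hold classes, they trade places. When both are empty the neighbor equals the original table.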

4.9 Implementation

The problem was implemented in Java and run on a PC with an Intel Core 3.10 GHz. Both algorithms were implemented and, for each execution when varying parameters, relevant data were extracted. Each algorithm was set to solve for a timetable 8 times for a given instance. Stop conditions were set to imax = 10000 iterations or a maximum time limit of tmax = 600 s.
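The combined stop condition can be expressed as a small helper. This is a sketch with hypothetical names; only the limits imax = 10000 and tmax = 600 s come from the text.

```java
final class StopCondition {
    private final int maxIterations;
    private final long maxNanos;
    private final long startNanos = System.nanoTime(); // clock starts at construction

    StopCondition(int maxIterations, double maxSeconds) {
        this.maxIterations = maxIterations;
        this.maxNanos = (long) (maxSeconds * 1e9);
    }

    /** True once either the iteration budget or the time budget is used up. */
    boolean met(int iterations) {
        return iterations >= maxIterations
                || System.nanoTime() - startNanos >= maxNanos;
    }

    public static void main(String[] args) {
        StopCondition stop = new StopCondition(10000, 600.0); // imax and tmax from the text
        int iterations = 0;
        while (!stop.met(iterations)) {
            // one search step (TS or SA) would go here
            iterations++;
        }
        System.out.println(iterations);
    }
}
```

`System.nanoTime()` is used because it is monotonic, which makes it better suited to measuring elapsed time than wall-clock time.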

5 Evaluation

The first complication that was observed, which can be seen in figures 1-4, was that the hard constraints were violated in almost every instance. Stop conditions on both iterations and time were set to see whether the algorithms solved the problems within a limited amount of time or iterations, which they did not. Despite this, comparisons of the performance between these algorithms could still be performed.

[Figure 1 panels, x-axis Days (30 to 90), series TS and SA: Total cost; Hard constraint costs; Time elapsed [s]; Iterations]

Figure 1: The graphs show the mean values of different instances taken when varying the number of days. The top left graph shows the mean total cost of the timetables that were constructed for each variation. The top right graph shows the mean hard constraint costs of the timetables that were constructed. The bottom left graph shows the total time elapsed in seconds and the bottom right shows the total amount of iterations for each variation. The red horizontal line indicates the limit set in this timetabling problem.

Figure 2 indicates that there was no difference with respect to the cost of the tables when varying the number of rooms. Although there were some fluctuations in the cost, these may have been caused by the randomness in the process of generating solutions. Each bar represents the mean value of 8 runs, which might not have been enough to eliminate deviations of this kind.

[Figure 2 panels, x-axis Rooms (4 to 8), series TS and SA: Total cost; Hard constraint costs; Time elapsed [s]; Iterations]

Figure 2: The graphs show the mean values of different instances taken when varying the number of available rooms. The top left graph shows the mean total cost of the timetables that were constructed for each variation. The top right graph shows the mean hard constraint costs of the timetables that were constructed. The bottom left graph shows the total time elapsed in seconds and the bottom right shows the total amount of iterations for each variation. The red horizontal line indicates the limit set in this timetabling problem.

[Figure 3 panels, x-axis Unavailable timeslots (2 to 6), series TS and SA: Total cost; Hard constraint costs; Time elapsed [s]; Iterations]

Figure 3: The graphs show the mean values of different instances taken when varying the number of unavailable timeslots per week for each teacher. The top left graph shows the mean total cost of the timetables that were constructed for each variation. The top right graph shows the mean hard constraint costs of the timetables that were constructed. The bottom left graph shows the total time elapsed in seconds and the bottom right shows the total amount of iterations for each variation. The red horizontal line indicates the limit set in this timetabling problem.

[Figure 4 panels, x-axis Classes per week (10 to 20), series TS and SA: Total cost (×10^4); Hard constraint costs (×10^4); Time elapsed [s]; Iterations]

Figure 4: The graphs show the mean values of different instances taken when varying the number of classes per week for each curriculum. The top left graph shows the mean total cost of the timetables that were constructed for each variation. The top right graph shows the mean hard constraint costs of the timetables that were constructed. The bottom left graph shows the total time elapsed in seconds and the bottom right shows the total amount of iterations for each variation. The red horizontal line indicates the limit set in this timetabling problem.

The resulting costs of the tables from each instance were summed and divided by the total number of runs. This gave the total mean value for each algorithm. These values were compared and viewed as a measure of how well the algorithms performed across varying sets of resources. These mean values are presented in table 3.


Mean values of the total cost

Parameter   µTS    µSA    µTS − µSA   rµ %
T           2133   2052   81          3.9
R           1398   1439   −41         −2.8
ρ           6842   6744   98          1.5
σ           1461   1395   66          4.7

Mean values of hard constraint costs

Parameter   µTS    µSA    µTS − µSA   rµ %
T           1196   1042   154         14.8
R           456    439    17          3.9
ρ           5643   5509   134         2.4
σ           536    408    128         31.4

Table 3: The table shows the summed mean values of each algorithm, µTS and µSA, for all variations of parameters. rµ shows the relative difference between the mean values with respect to µSA.
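The relative difference in Table 3 is the difference of the two means divided by the SA mean, expressed in percent. The sketch below reproduces two table entries; the class and method names are hypothetical.

```java
final class MeanComparison {

    /** Arithmetic mean of the costs of all runs. */
    static double mean(double[] costs) {
        double sum = 0.0;
        for (double c : costs) {
            sum += c;
        }
        return sum / costs.length;
    }

    /** Relative difference of the TS mean with respect to the SA mean, in percent. */
    static double relativeDifferencePercent(double muTs, double muSa) {
        return 100.0 * (muTs - muSa) / muSa;
    }

    public static void main(String[] args) {
        // Total cost when varying T: muTS = 2133, muSA = 2052 (Table 3).
        System.out.printf("%.1f%n", relativeDifferencePercent(2133, 2052));
        // Hard constraint costs when varying sigma: muTS = 536, muSA = 408 (Table 3).
        System.out.printf("%.1f%n", relativeDifferencePercent(536, 408));
    }
}
```

A positive value means that TS ended with the higher mean cost, so SA performed better for that parameter.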

Despite SA having a lower cost in almost all of the instances, the summed mean values from table 3 showed small differences between the algorithms considering the absolute mean values. The deviations when varying T and σ, represented as error bars in figure 5, show larger variations when both the scale and the difficulty of the problem were increased. This suggests that different paths in a more complex solution space were explored, which gave rise to different end results.

[Figure 5 panels, y-axis Total cost: Tabu Search over Days (30 to 90); Simulated Annealing over Days (30 to 90); Tabu Search over Unavailable timeslots (2 to 6); Simulated Annealing over Unavailable timeslots (2 to 6)]

Figure 5: Total cost of the constructed timetables when varying the number of days and the number of unavailable timeslots per week for each teacher. The red error bars represent the deviations from each set of resources.

6 Discussion

These sets of data were synthetically crafted and were not benchmarking sets. Comparisons with other results from similar setups were therefore hard to make. A big problem in this field is that there is no general definition of the timetabling problem and hence no smooth and solid way of comparing results. Without a common ground, valid comparisons across different experiments cannot be made. This is the main reason why MN was founded and has grown as a community. But real world problems are seldom similar to each other, which makes evaluations of different problems still valuable. Particular problems may also underline the difficulties in comparing and solving different timetabling problems.


6.1 Control parameters

In this evaluation, efforts were made to establish a common ground for better comparison. The initially generated timetables for every execution and the procedure for finding neighboring tables were the same for both algorithms. Therefore, the differences between these two approaches to solving for a timetable were mainly their metaheuristic features and the parameters that controlled these procedures. While little attention and effort were spent on tuning these parameters, they still contributed to the outcome of the results.

6.1.1 Temperature

The determination of TSA was posed as a boundary problem. An initial temperature, T0, was set according to the desired probability of accepting worse solutions at the beginning of the run. The temperature was defined to decrease exponentially and reach a value such that the desired probability of accepting a worse solution would be 1 % at the end of the run. Varying the boundaries or defining the temperature in another way drastically changed how SA performed.

6.1.2 Tabu list

It was observed during the extraction of data that TS did not revisit many of the solutions stored in the tabu list. This could be due to the fact that the number of neighbors of each solution was significantly higher than the maximum number of timetables the tabu list could contain. The metaheuristic guidance for TS was therefore not utilized to its full potential, which may be the reason why this approach lagged behind in almost all of the variations that were made.
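A rough count illustrates the imbalance. Assume (this section does not state it explicitly) that the number of events is $|E| = T \cdot R$ and that every unordered pair of distinct events counts as one possible swap. For the standard setting with $T = 240$ and $R = 6$ this gives

$$|E| = 240 \cdot 6 = 1440, \qquad \binom{1440}{2} = \frac{1440 \cdot 1439}{2} = 1\,036\,080.$$

Swaps between two empty events yield the same table, so the number of distinct neighbors is smaller, but under these assumptions it would likely still be far above the tabu list length of 100.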

6.2 Finding neighboring timetables

The operator that was used to generate neighboring tables may have affected the generation of feasible solutions. There may have been situations where a single exchange of mappings between C3 and E would not have been enough to escape a local minimum in the solution space. Other operators, such as multiple remappings in the same table, ordered or random, may have been required to overcome local minima and find feasible solutions. Both algorithms generated neighbors in the same way and suffered equally, but ultimately neither of them found feasible solutions when the varied parameters were increased.

6.3 Cost parameters

The cost parameters played a big role when calculating the initial and final outputs of (2). Only the soft constraints were considered in these calculations. Since the acceptance function was dependent on the cost difference, ∆C, big differences between the hard and soft constraints were notable in these calculations. No effort was made to fine tune these cost parameters. Substantially higher values were instead assigned to the cost parameters of the hard constraints than to those of the soft constraints. This made the output of (2) negligibly small for moves that violated hard constraints. A smaller difference between the cost parameters of the hard and soft constraints would have given SA more opportunities to reconsider violating hard constraints and in this way maybe find its way to feasible solutions.
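To illustrate the scale, take the initial temperature $T_0 \approx 4.3$ from (4) and compare the typical soft constraint increase $\Delta C \approx 10$ with a hypothetical increase of $\Delta C = 72$, the value that the conclusion associates with roughly one hard constraint violation. The exact hard constraint cost parameters are not given in this section, so the second figure is an assumption made for illustration:

$$e^{-10/4.3} \approx 0.1, \qquad e^{-72/4.3} \approx 5 \cdot 10^{-8}.$$

Under this assumption, a move that adds one hard constraint violation would practically never be accepted, even at the start of the run.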

7 Conclusion

Comparisons between Tabu Search and Simulated Annealing were made on the problem of curriculum-based course timetabling. A simulated problem was constructed and both algorithms were implemented as fairly as possible. Four parameters were varied and the algorithms solved for the best possible solutions under a limited amount of time and computational capacity. While Simulated Annealing was shown to perform better in the majority of these runs, the overall difference in mean performance across all sets of resources that were used was small. A typical difference in the mean total cost of these two algorithms was 72, which in this problem corresponded to around one hard constraint violation. The relative difference between the mean total costs with respect to the slightly better algorithm (SA) was not more than 5 %, while the corresponding quantity for the hard constraint costs varied a lot.

Due to time limitations of this evaluation, more data for each set of resources were not extracted. If more data could have been acquired, a proper statistical analysis could have been done, which would have given credibility to the conclusions. Despite this, this paper shows the difficulties in timetabling problems. Much can be learned and used to improve future work on these kinds of problems. To make the analysis valuable for future work, common and established benchmarking sets of resources should instead be used as an anchor point. This way, comparisons with other participants can be made, and varying parameters from these sets may better show how they affect the results for different algorithms.


References

[1] - Burke, Edmund, et al. "Automated university timetabling: The state of the art." The Computer Journal 40.9 (1997): 565-571.
[2] - Lewis, Rhydian. "A survey of metaheuristic-based techniques for university timetabling problems." OR Spectrum 30.1 (2008): 167-190.
[3] - Elmohamed, MA Saleh, Paul Coddington, and Geoffrey Fox. "A comparison of annealing techniques for academic course scheduling." Practice and Theory of Automated Timetabling II. Springer Berlin Heidelberg, 1998. 92-112.
[4] - Bonutti, Alex, et al. "Benchmarking curriculum-based course timetabling: formulations, data formats, instances, validation, visualization, and results." Annals of Operations Research 194.1 (2012): 59-70.
[5] - Rossi-Doria, Olivia, et al. "A comparison of the performance of different metaheuristics on the timetabling problem." Practice and Theory of Automated Timetabling IV. Springer Berlin Heidelberg, 2003. 329-351.
[6] - Lü, Zhipeng, and Jin-Kao Hao. "Adaptive tabu search for course timetabling." European Journal of Operational Research 200.1 (2010): 235-244.
[7] - Helena S. Interviewed by: Daniil B. Royal Institute of Technology, March 6, 2015.
[8] - Blum, Christian, and Andrea Roli. "Metaheuristics in combinatorial optimization: Overview and conceptual comparison." ACM Computing Surveys (CSUR) 35.3 (2003): 268-308.
[9] - Bettinelli, Andrea, et al. "An overview of curriculum-based course timetabling." TOP (2015): 1-37.
[10] - Aladag, Cagdas Hakan, Gulsum Hocaoglu, and Murat Alper Basaran. "The effect of neighborhood structures on tabu search algorithm in solving course timetabling problem." Expert Systems with Applications 36.10 (2009): 12349-12356.
[11] - Van Laarhoven, Peter JM, and Emile HL Aarts. Simulated annealing. Springer Netherlands, 1987.
