
SDQ for data set 5 (500 runs, 5 mins each)

PDF for relative solution size for data set 5 (500 runs, 5 mins each)

Data set 1 (69 variables, over-constrained)

Search  Mean   Median  Mode  Std. Dev.  Min  Max
BT      57     57      57    0          57   57
LS      47.2   48      48    4.76       29   55
ERA     32     32      34    4.46       21   46
RDGR    58.77  59      59    2.31       13   61
RGR     57.94  58      58    3.95       7    62

Statistics of solution size (500 runs, 5 mins each)

Data set 5 (64 variables, tight but solvable)

Search  Mean   Median  Mode  Std. Dev.  Min  Max
BT      62     62      62    0          62   62
LS      45.98  47      46    4.69       26   55
ERA     59.62  64      64    5.34       37   64
RDGR    57.87  58      59    5.23       7    63
RGR     56.13  58      58    10.07      4    63

Statistics of solution size (500 runs, 5 mins each)

SDQ for data set 1 (500 runs, 5 mins each)

PDF for relative solution size for data set 1 (500 runs, 5 mins each)

[Figure: ERA performance on unsolvable problems. Number of agents in zero position (y-axis) versus iteration (x-axis, 1 to 191). Series: spring 2001b (O), fall 2002 (O).]

How interactive selection works:
1. The manager chooses a perspective to work from.
2. An updated and sorted list of possible and consistent choices is provided.
3. The manager selects a choice and makes an assignment.
4. Constraint propagation is performed and removes choices that are no longer available (alternatively, it restores choices that were ruled out).

This ensures that the manager is only ever presented with consistent options.
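The selection loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the deployed system: the class `InteractiveSession`, its method names, and the restriction to mutex constraints are assumptions made for the example, and the domains are assumed to have already been filtered by the unary constraints.

```python
from typing import Dict, Iterable, List, Set, Tuple


class InteractiveSession:
    """Hypothetical helper: keeps the manager's choices consistent."""

    def __init__(self,
                 domains: Dict[str, Set[str]],
                 preferences: Dict[Tuple[str, str], int],
                 mutex: Iterable[Tuple[str, str]]):
        self.domains = {c: set(g) for c, g in domains.items()}
        self.preferences = preferences            # (gta, course) -> 0..5
        self.mutex = {frozenset(p) for p in mutex}
        self.assignment: Dict[str, str] = {}      # course -> gta

    def choices(self, course: str) -> List[str]:
        """Step 2: consistent GTAs for a course, best preference first."""
        blocked = {gta for other, gta in self.assignment.items()
                   if other != course
                   and frozenset((course, other)) in self.mutex}
        live = self.domains[course] - blocked
        return sorted(live,
                      key=lambda g: (-self.preferences.get((g, course), 0), g))

    def assign(self, course: str, gta: str) -> None:
        """Step 3: only consistent choices can be selected."""
        if gta not in self.choices(course):
            raise ValueError(f"{gta} is not a consistent choice for {course}")
        self.assignment[course] = gta

    def unassign(self, course: str) -> None:
        """Undoing an assignment restores the choices it had ruled out."""
        self.assignment.pop(course, None)
```

Because `choices` recomputes the filtered list from the original domains on every call, removing and restoring options (step 4) both fall out of the same function. A real implementation would likely propagate incrementally instead.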

[Figure annotations: # total agents: 65; # agents involved in deadlock: 24; # unused GTAs: 8. Legend: (colorless) agent in zero position; (colorful) agent in deadlock.]

[Figure: ERA performance on solvable problem. x-axis: iteration (1 to 191). Series: spring 2001b (B), fall 2002 (B), fall 2001b, spring 2003.]

An Interactive, Constraint-Based System for Task Allocation in an Academic Environment

Ryan Lim, Venkata Praveen Guddeti, Venkateshwar Rao Thota, Hui Zou, and Berthe Y. Choueiry

http://cse.unl.edu/~gta

Constraint Systems Laboratory • Computer Science & Engineering • University of Nebraska • {rlim|vguddeti|vthota|hzou|choueiry}@cse.unl.edu

Project Summary

This system has yielded research contributions in the following areas:
1. Formulation of the GTA assignment problem as a Constraint Satisfaction Problem (CSP) [2].
   • Design of a new convention for consistency checking to deal with over-constrained problems.
   • Reformulation of some global constraints into binary ones, and evaluation of the computational benefits of the reformulation.
2. Design, implementation, and deployment of a prototype for data acquisition and for interactive problem solving.
3. Design, evaluation, and new characterization of both heuristic and stochastic search techniques for automatically solving the problem.
   • Heuristic backtrack [5]
   • Stochastic local search [9]
   • Multi-agent ERA search [8]
   • Randomized backtrack [5]

This project has opened up the following research directions:
1. A portfolio mechanism for on-line hybridization of search techniques.
2. Development of constraint-based techniques for data-warehousing and compact representation of solutions.

The practical benefits of the research conducted so far:
1. A number of research results published and presented in international scientific meetings.
2. Training of undergraduate and graduate students in Constraint Processing techniques, and production of a number of dissertations.
3. Practical benefit for the department: decreased time and effort for finding a solution, reduced number of assignment conflicts and modifications, and improved matching of GTAs to classes.

System Architecture

The system has the following main components:
1. A GTA web-interface accessible to GTAs for application.
2. A manager web-interface for data management and decision making:
   1. View/edit GTA records
   2. Set up classes
   3. Specify constraints
   4. Perform interactive selections or automated search
3. A relational database to store collected data.
4. Facilities for interactive decision making and exploration of solutions and bottlenecks.
5. A variety of search algorithms for automated problem solving.

The architecture of our system

Problem Modeling and Constraint Reformulation

Problem definition: Given a set of GTAs, a set of courses, and a set of constraints that specify the allowable assignments of GTAs to courses, the goal is to find an assignment that is:
• Consistent: the assignment breaks no constraint.
• Satisfactory: the assignment maximizes the number of courses covered and the happiness of the assigned GTAs.

Unary

• ITA certification – GTA must be ITA qualified to teach the constrained course

• Enrollment – GTA cannot be enrolled in the constrained course.

• Overlapping – GTA cannot be assigned to a course that requires an instructor if he/she is enrolled in a course at the same time

• Zero Preference – GTA cannot have a preference of 0 for the course.

Binary

• Mutex – Courses cannot be assigned the same GTA.

Non-binary

• Equality – all courses should be assigned the same GTA

• Capacity – no GTA should be assigned to a workload that exceeds his/her capacity

• Confinement – assignments to two specific sets of courses should be mutually exclusive
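A few of the constraints listed above can be expressed as simple predicates. The sketch below is illustrative only: the `GTA` and `Course` records, their field names, and the treatment of a missing preference as 0 are assumptions, and the overlapping constraint is omitted because it needs timetable data that the poster does not describe.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set, Tuple


@dataclass
class GTA:                      # hypothetical record layout
    name: str
    ita_certified: bool
    capacity: float             # maximum total load the GTA can carry
    enrolled_in: Set[str] = field(default_factory=set)
    preference: Dict[str, int] = field(default_factory=dict)  # course -> 0..5


@dataclass
class Course:                   # hypothetical record layout
    name: str
    load: float
    needs_ita: bool = False     # True when the ITA constraint applies


def passes_unary(gta: GTA, course: Course) -> bool:
    """ITA certification, enrollment, and zero-preference constraints."""
    if course.needs_ita and not gta.ita_certified:
        return False
    if course.name in gta.enrolled_in:
        return False
    # Assumption: a course without a stated preference counts as 0.
    return gta.preference.get(course.name, 0) != 0


def mutex_ok(assignment: Dict[str, str],
             mutex_pairs: Iterable[Tuple[str, str]]) -> bool:
    """Binary mutex: the two courses of a pair never share a GTA."""
    return not any(a in assignment and b in assignment
                   and assignment[a] == assignment[b]
                   for a, b in mutex_pairs)


def capacity_ok(assignment: Dict[str, str],
                courses: Dict[str, Course],
                gtas: Dict[str, GTA]) -> bool:
    """Non-binary capacity: total assigned load stays within capacity."""
    used: Dict[str, float] = {}
    for course_name, gta_name in assignment.items():
        used[gta_name] = used.get(gta_name, 0.0) + courses[course_name].load
    return all(total <= gtas[name].capacity for name, total in used.items())
```

The unary predicate can be applied once, up front, to shrink each course's domain; the mutex and capacity checks apply to partial assignments as they grow.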

Reformulation of non-binary constraints: A constraint is network decomposable [2] when it can be represented by an equivalent network of binary constraints.
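As an illustration of network decomposability, the non-binary equality constraint over k courses can be replaced by a chain of k - 1 binary equality constraints. The chain is one possible decomposition chosen for this sketch; the poster does not state which binary network the authors used. The function names are hypothetical, and `equivalent_on` simply checks by brute force that both formulations accept exactly the same assignments.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple


def equality_chain(courses: Sequence[str]) -> List[Tuple[str, str]]:
    """Binary constraints (c1 = c2), (c2 = c3), ... over consecutive courses."""
    return list(zip(courses, courses[1:]))


def global_equality(assignment: Dict[str, str], courses: Sequence[str]) -> bool:
    """Non-binary form: all courses are assigned the same GTA."""
    return len({assignment[c] for c in courses}) <= 1


def binary_chain_holds(assignment: Dict[str, str],
                       chain: Sequence[Tuple[str, str]]) -> bool:
    """Binary form: every pairwise equality in the chain holds."""
    return all(assignment[a] == assignment[b] for a, b in chain)


def equivalent_on(courses: Sequence[str], gtas: Sequence[str]) -> bool:
    """Exhaustively compare both formulations on all complete assignments."""
    chain = equality_chain(courses)
    for values in product(gtas, repeat=len(courses)):
        assignment = dict(zip(courses, values))
        if global_equality(assignment, courses) != binary_chain_holds(assignment, chain):
            return False
    return True
```

The exhaustive check is only practical for tiny instances, but it makes the equivalence claim concrete.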

Interactive Decision Making

Interactive Selection allows the manager to interactively make decisions by examining the problem from two perspectives:

Backtrack search

The main issues in backtrack search are the following:
1. The problem is always tight (difficult to solve) and often over-constrained (not enough GTAs hired).
   • We modified the basic backtrack mechanism to handle over-constrained problems.
2. The performance of search depends on the sequence in which decisions are made (i.e., variable and value ordering).
   • We developed various ordering heuristics, and evaluated them with static and dynamic decision strategies.
3. The branching factor of the search tree is particularly large.
   • We identified the reasons for thrashing and characterized the thrashing behavior, which we address with randomization (see Randomized Backtrack Search).

Stochastic Local Search

• This is a hill-climbing search using the min-conflict heuristic for value selection.
• Consistent assignments are not undone (greedy).
• Constraint propagation is used to handle non-binary constraints.
• Random walk is used to avoid local optima; random restarts are used to recover from them.
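The core of such a search can be sketched as follows. This is a generic min-conflict hill climber with random walk and random restarts over mutex-style binary constraints; it does not reproduce the authors' greedy rule of never undoing consistent assignments or their propagation of non-binary constraints. The function names and the default parameter values are assumptions.

```python
import random
from typing import Dict, List, Optional, Set


def conflicts(var: str, value: str,
              assignment: Dict[str, str],
              neighbors: Dict[str, Set[str]]) -> int:
    """Number of neighbors of `var` that currently hold `value`."""
    return sum(1 for other in neighbors[var] if assignment.get(other) == value)


def min_conflicts(domains: Dict[str, Set[str]],
                  neighbors: Dict[str, Set[str]],
                  max_steps: int = 1000,
                  walk_prob: float = 0.1,
                  restarts: int = 5,
                  seed: int = 0) -> Optional[Dict[str, str]]:
    rng = random.Random(seed)
    for _ in range(restarts):                       # random restarts
        assignment = {v: rng.choice(sorted(domains[v])) for v in domains}
        for _ in range(max_steps):
            conflicted: List[str] = [
                v for v in domains
                if conflicts(v, assignment[v], assignment, neighbors) > 0]
            if not conflicted:
                return assignment                   # all constraints satisfied
            var = rng.choice(conflicted)
            values = sorted(domains[var])
            if rng.random() < walk_prob:            # random walk step
                assignment[var] = rng.choice(values)
            else:                                   # min-conflict step
                assignment[var] = min(
                    values,
                    key=lambda val: conflicts(var, val, assignment, neighbors))
    return None                                     # no solution found
```

Returning `None` after the last restart reflects the incompleteness of the method: failure to find a solution says nothing about whether one exists.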

          # of   BT running for 5 mins             BT running for 6 hours
Data set  vars   Max depth  Shallowest level  %    Max depth  Shallowest level  %
I         69     57         53                23   57         51                26
II        65     63         55                15   63         54                16
III       54     52         44                18   54         41                24
IV        59     49         48                18   50         45                23
V         64     62         54                15   62         47                26
VI        31     28         13                58   28         3                 90

BT search thrashing.
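The % column is not defined on the poster. Its values are consistent with the share of levels lying below the shallowest level that backtracking reaches, (n - level) / n, truncated to a whole percent. The short sketch below assumes that reading; the function name is hypothetical.

```python
def revised_percentage(num_vars: int, shallowest_level: int) -> int:
    """Share of the num_vars levels below the shallowest level reached,
    truncated to a whole percent (an inferred formula, not the authors')."""
    return (num_vars - shallowest_level) * 100 // num_vars


# Data set I after 5 mins and 6 hours, data set VI after 5 mins and 6 hours.
print(revised_percentage(69, 53), revised_percentage(69, 51))  # 23 26
print(revised_percentage(31, 13), revised_percentage(31, 3))   # 58 90
```

Under this reading, on data set I even six hours of search only revisits roughly the deepest quarter of the levels, which is the thrashing behavior the table documents.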

Reformulation of the non-binary confinement constraint into a binary confinement constraint.

Reformulation of the non-binary equality constraint into a binary equality constraint.

Multi-Agent Search

• Multi-agent based search using the ERA (environment, reactive rules, agents) algorithm.
• Agents are variables that seek to occupy good positions in the environment (values).
• The environment records the number of constraint violations of each position.
• An agent moves according to reactive rules and can force another agent out of a position.
• The algorithm acts as an ‘extremely’ decentralized local search.
• It is able to solve instances that remained unsolved by the other techniques we tested.
• Deadlock in over-constrained problems undermines stability and results in short solutions. However, deadlock is useful for identifying, isolating, and representing conflicts in a compact manner.
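The bullets above can be made concrete with a small sketch of an ERA-style loop. This is a simplified illustration under stated assumptions, not the authors' implementation: only mutex-style binary constraints are modeled, the three reactive rules (random move, better move, least move) and their probabilities `p_random` and `p_better` are illustrative choices, and agents already in a zero position are assumed to stay put.

```python
import random
from typing import Dict, Set, Tuple


def violations(agent: str, value: str,
               positions: Dict[str, str],
               neighbors: Dict[str, Set[str]]) -> int:
    """Violation count the environment records for `agent` at `value`."""
    return sum(1 for other in neighbors[agent] if positions[other] == value)


def era(domains: Dict[str, Set[str]],
        neighbors: Dict[str, Set[str]],
        max_iters: int = 200,
        p_random: float = 0.02,
        p_better: float = 0.3,
        seed: int = 0) -> Tuple[Dict[str, str], bool]:
    rng = random.Random(seed)
    positions = {a: rng.choice(sorted(domains[a])) for a in domains}
    for _ in range(max_iters):
        for agent in domains:
            current = violations(agent, positions[agent], positions, neighbors)
            if current == 0:
                continue                      # agent is in a zero position
            values = sorted(domains[agent])
            roll = rng.random()
            if roll < p_random:               # random move
                positions[agent] = rng.choice(values)
            elif roll < p_random + p_better:  # better move
                candidate = rng.choice(values)
                if violations(agent, candidate, positions, neighbors) < current:
                    positions[agent] = candidate
            else:                             # least move
                positions[agent] = min(
                    values,
                    key=lambda v: violations(agent, v, positions, neighbors))
        if all(violations(a, positions[a], positions, neighbors) == 0
               for a in domains):
            return positions, True            # every agent in a zero position
    return positions, False                   # some agents never settle
```

On an over-constrained instance the loop ends with `False` and some agents outside zero positions, which corresponds to the deadlock behavior described above.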

Each circle corresponds to a position (i.e., a GTA). Each square represents an agent (i.e., a task). Blank squares indicate that the position is a zero position for the agent. Filled squares indicate that, although the position is the best one for the agent, it results in some broken constraints, so the actual assignment of the position to the agent cannot be made.

• Randomization to visit a wider area of the space, restarts to overcome thrashing.
• Randomization & Geometric Restarts (RGR) [Walsh 99]: fixed restart schedule.
• Randomization & Dynamic Geometric Restarts (RDGR): dynamic restart schedule.
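A geometric restart schedule multiplies the cutoff (the search effort allowed before a restart) by a constant ratio after every restart. The sketch below shows such a fixed schedule, followed by one possible dynamic rule in which the cutoff grows only after a restart that improved the best solution. The dynamic rule is an assumption made for illustration, since the poster says only that RDGR uses a dynamic schedule; the function names and parameter values are also hypothetical.

```python
from typing import List


def geometric_cutoffs(initial_cutoff: int, ratio: float, restarts: int) -> List[int]:
    """Fixed schedule: cutoff_i = initial_cutoff * ratio**i."""
    cutoffs = []
    cutoff = float(initial_cutoff)
    for _ in range(restarts):
        cutoffs.append(int(cutoff))
        cutoff *= ratio
    return cutoffs


def next_dynamic_cutoff(cutoff: float, ratio: float, improved: bool) -> float:
    """Assumed dynamic rule: grow the cutoff only when the last restart
    improved on the best solution found so far."""
    return cutoff * ratio if improved else cutoff


print(geometric_cutoffs(10, 2, 4))  # [10, 20, 40, 80]
```

A fixed schedule lengthens every run regardless of progress, while a dynamic rule of this kind keeps runs short as long as they are not paying off.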

[1] R. Glaubius. A Constraint Processing Approach to Assigning Graduate Teaching Assistants to Courses. Undergraduate Honors Thesis. Department of Computer Science and Engineering, University of Nebraska-Lincoln, 2001.

[2] R. Glaubius and B.Y. Choueiry. Constraint Modeling and Reformulation in the Context of Academic Task Assignment. In Working Notes of the Workshop on Modelling and Solving Problems with Constraints, ECAI 2002, Lyon, France, 2002.

[3] R. Glaubius and B.Y. Choueiry. Constraint Modeling and Reformulation in the Context of Academic Task Assignment. Poster presentation at the Fifth International Symposium on Abstraction, Reformulation and Approximation (SARA 2002), 2002.

[4] R. Glaubius and B.Y. Choueiry. Constraint Modeling in the Context of Academic Task Assignment. In Pascal Van Hentenryck, editor, Proceedings of the 8th International Conference on Principles and Practice of Constraint Programming (CP02), volume 2470 of Lecture Notes in Computer Science, page 789, Ithaca, NY, 2002. Springer Verlag.

[5] V. Guddeti, H. Zou, and B.Y. Choueiry. An Empirical Study of Heuristic and Randomized Search Techniques in a Real-World Setting, 2004. Under review.

[6] R. Lim, V. Guddeti, and B.Y. Choueiry. An Interactive System for Hiring and Managing Graduate Teaching Assistants, 2004. Under review.

[7] H. Zou. Iterative Improvement Techniques for Solving Tight Constraint Satisfaction Problems. Master's thesis, Department of Computer Science and Engineering, University of Nebraska-Lincoln, Lincoln, NE, December 2003.

[8] H. Zou and B.Y. Choueiry. Characterizing the Behavior of a Multi-Agent Search by Using it to Solve a Tight, Real-World Resource Allocation Problem. In Workshop on Applications of Constraint Programming, pages 81-101, Kinsale, County Cork, Ireland, 2003.

[9] H. Zou and B.Y. Choueiry. Multi-agent Based Search versus Local Search and Backtrack Search for Solving Tight CSPs: A Practical Case Study. In Working Notes of the Workshop on Stochastic Search Algorithms (IJCAI 03), pages 17-24, Acapulco, Mexico, 2003.

Support: NSF grant #EPS-0091900, Department of Computer Science & Engineering, and Constraint Systems Laboratory.

Experiments were carried out on PrairieFire, courtesy of the Research Computing Facility of Computer Science & Engineering.

Domains: GTAs make up the domains of the variables. A GTA may serve as an instructor if he/she is ITA certified. Each GTA may specify a preference value on a scale of 0 to 5 for each course offered.

Constraints: We have 3 types of constraints: unary, binary, and non-binary.

Variables: Courses are modeled as variables in our CSP. There are 3 types of courses (lectures, labs, and recitations). Lectures may require a grader GTA, while labs and recitations require an instructor GTA.

Data set          Ref.  O/B  Solvable?  #Courses  #GTAs  TC    TL    TC/TL  Unary  Binary  Non-binary
Spring2001b(B)    1     B    Y          69        35     35    29.6  1.18   277    1179    70
Spring2001b(O)    2     O    N          69        26     26    29.6  0.88   277    1179    52
Fall2001b(B)      3     B    Y          65        35     31    29.3  1.06   267    1676    70
Fall2001b(O)      4     O    Y          65        34     30    29.3  1.02   267    1676    68
Fall2002(B)       5     B    Y          31        33     16.5  13    1.27   233    1124    66
Fall2002(O)       6     O    N          31        28     11.5  13    0.88   233    1124    56
Spring2003(B)     7     B    Y          54        36     29.5  27.4  1.08   250    622     72
Spring2003(O)     8     O    Y          54        34     27.5  27.4  1      250    622     68
Fall2002(B)-NP    9     B    N          59        33     32    29.5  1.08   233    1124    66
Fall2002(O)-NP    10    O    N          59        28     27    29.5  0.91   233    1124    56
Spring2003(B)-NP  11    B    Y          64        34     33    30.2  1.09   250    622     72
Spring2003(O)-NP  12    O    Y          64        34     31    30.2  1.02   250    622     68

Ref. = reference number; O/B = original or boosted; #Courses = number of variables; #GTAs = domain size; TC = total capacity; TL = total load; TC/TL = ratio of total capacity to total load; Unary, Binary, Non-binary = constraints of each type.

Characteristics of the GTA data set.
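The ratio column can be recomputed directly from total capacity and total load. A ratio below 1 means the hired GTAs cannot cover the full load, and every data set with such a ratio is marked unsolvable in the table. A ratio above 1 does not guarantee solvability, as Fall2002(B)-NP (ratio 1.08, marked N) shows. The helper name below is hypothetical.

```python
def capacity_ratio(total_capacity: float, total_load: float) -> float:
    """Ratio = TC / TL, rounded to two decimals."""
    return round(total_capacity / total_load, 2)


# Spring2001b(B): enough capacity on paper.
print(capacity_ratio(35, 29.6))   # 1.18
# Spring2001b(O): capacity falls short of the load, so no full cover exists.
print(capacity_ratio(26, 29.6))   # 0.88
```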

A list of courses with a sorted list of selectable GTAs.

A list of GTAs with a sorted list of selectable courses.

We have developed an interactive, web-based system for hiring and managing Graduate Teaching Assistants (GTAs) at the Department of Computer Science.

[Figure: BT search thrashing in a large search space. Number of variables: 69. Max depth: 57. Shallowest level reached by BT after 1 min: 55 (20%); after 24 hr: 51 (26%).]

Randomized Backtrack Search with Restarts

References

Comparison of Techniques

ERA
• General: Stochastic and incomplete.
• Tight but solvable problems: Immune to local optima and solves tight CSPs.
• Over-constrained problems: Deadlock causes instability and yields shorter solutions.

LS
• General: Stochastic, incomplete, and quickly stabilizes.
• Tight but solvable problems: Liable to local optima and fails to solve tight CSPs even with random-walk and restart strategies.
• Over-constrained problems: Finds longer solutions than ERA.

RDGR
• General: Stochastic, incomplete, immune to thrashing, produces longer solutions than BT, immune to deadlock, reliable on unknown instances, and immune to local optima, though less so than ERA.

RGR
• General: Stochastic, approximately complete, less immune to thrashing than RDGR, and yields shorter solutions than RDGR in general.

BT
• General: Systematic, complete (theoretically, rarely in practice), liable to thrashing, yields shorter solutions than RDGR and RGR, stable behavior, and more stable solutions than stochastic methods in general.

Summary and Future Work

In the future, we plan to:
1. Validate our findings on randomly generated problems and other real-world case-studies.
2. Design new search hybrids where a solution from a given technique such as ERA is fed as a seed to another one such as heuristic backtrack search.

Data Set I (69 variables, over-constrained)

CPU run time                  30 secs  5 mins  30 mins  1 hour  6 hours  24 hours
Shallowest BT level           54       53      52       52      51       51
Longest solution              57       57      57       57      57       57
Geometric mean of preference  2.15     2.17    2.17     2.21    2.27     2.27
# of backtracks               1835     47951   261536   532787  3274767  13070031
# of nodes visited            3526     89788   486462   989136  6059638  24146133

Performance of BT for various CPU run-times.
