Heuristic Search and Information Visualization Methods for School Redistricting
University of Maryland Baltimore County
Marie desJardins, Blazej Bulka, Ryan Carr, Andrew Hunt, Priyang Rathod, and Penny Rheingans
This work was partially supported by NSF #IIS-0414976. Thanks to David Drown and the Howard County Public School System for data and valuable input.
July 18, 2006 2
Overview
The Problem: School Redistricting
Searching for Good Plans
Results
Future Work and Conclusions
The Problem: School Redistricting
School Redistricting
Assign neighborhoods (or planning polygons) within a school district to schools while considering multiple factors, such as busing costs, test score distribution, and school utilization
Finding the best assignment (or plan) is a multiattribute optimization problem
Also want to generate qualitatively different plans that represent tradeoffs among the criteria, and help users visualize these tradeoffs
Search space is very large: $O(s^p)$, where s is the number of schools (12 high schools; 30 elementary) and p is the number of polygons (~263)
Currently in Howard County, Maryland, the process is almost entirely manual
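To get a feel for the size of this space, the count of candidate plans can be computed directly. This is a small sketch that assumes every polygon may be assigned to any school independently (which gives s^p plans); it uses the slide's figures of 12 high schools and roughly 263 polygons, and the function name `plan_count` is purely illustrative.

```python
def plan_count(num_schools: int, num_polygons: int) -> int:
    """Number of distinct assignments if every polygon may go to any school."""
    return num_schools ** num_polygons


# Figures from the slide: 12 high schools, roughly 263 planning polygons.
high_school_plans = plan_count(12, 263)

# Python integers are arbitrary precision, so the count is exact.
print(f"12^263 has {len(str(high_school_plans))} decimal digits")
```

Even before constraints such as contiguity are considered, a count of this size rules out exhaustive enumeration.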
Evaluation Criteria
Educational benefits for students
Frequency with which students are redistricted
Number and distance of students bused
Total busing cost
Demographics and academic performance of schools
Number of students redistricted
Maintenance of feeder patterns
Changes in school capacity
Impact on specialized programs
Functional and operational capacity of school infrastructure
Building utilization
Evaluation Criteria
1. Number of students bused: Students who can walk to a school should be assigned to that school
2. Busing cost: Estimated as a population-weighted sum of polygon-school distances
3. Demographics: FARM (Free and Reduced Meal) ratio at each school should ideally be the same as that of the county as a whole
4. Academic performance: MSA (Maryland State Assessment) scores at each school should ideally be the same as those of the county as a whole
5. Capacity: Each school should be between 90% and 110% of available capacity
Penalty functions are defined for each of the five criteria above, giving per-polygon, per-school, and per-plan cost measures from 0 (good) to 1 (bad)
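The slides do not give the exact form of the penalty functions, so the following is only a sketch of what three of them might look like. The 90%-110% band and the population-weighted distance sum come from the slide; the linear ramp, the `ramp_width` parameter, the clipping to 1, and all function names are assumptions made for illustration.

```python
from typing import Sequence


def capacity_penalty(enrollment: float, capacity: float,
                     low: float = 0.90, high: float = 1.10,
                     ramp_width: float = 0.5) -> float:
    """0 inside the utilization band, rising linearly outside it, capped at 1.

    Only the band itself is from the slide; the ramp shape is an assumption.
    """
    utilization = enrollment / capacity
    if low <= utilization <= high:
        return 0.0
    outside = (low - utilization) if utilization < low else (utilization - high)
    return min(1.0, outside / ramp_width)


def busing_cost(populations: Sequence[float],
                distances: Sequence[float]) -> float:
    """Population-weighted sum of polygon-to-school distances (unnormalized)."""
    return sum(pop * dist for pop, dist in zip(populations, distances))


def ratio_penalty(school_ratio: float, county_ratio: float) -> float:
    """Absolute deviation of a school's ratio (e.g. FARM) from the county's.

    Using the raw absolute difference as the penalty is an assumption.
    """
    return min(1.0, abs(school_ratio - county_ratio))


print(capacity_penalty(1000, 1000))        # inside the band
print(busing_cost([100, 50], [2.0, 4.0]))  # two polygons bused to one school
```

To combine such measures into a per-plan cost between 0 and 1, the raw busing cost would still need to be normalized, for example against the worst plan considered.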
Selecting Multiple Plans: Diversity
One plan dominates another if it is better along all dimensions
Two plans are incomparable if each is better than the other along at least one dimension
A good set of plans should:
  contain no dominated plans
  consist of qualitatively different plans
We measure “qualitatively different” using Euclidean distance in the evaluation space:
  $\mathrm{Div}(P) = \frac{1}{|P|} \sum_{p_i, p_j \in P} \mathrm{Dist}(p_i, p_j)$
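The definitions on this slide can be written down compactly. The sketch below assumes each plan is represented by its vector of per-criterion costs (0 good, 1 bad, so "better" means lower), and it assumes the sum in Div(P) runs over unordered pairs of plans; the slide does not say whether pairs are ordered. The names `dominates`, `incomparable`, and `diversity` are illustrative.

```python
import math
from itertools import combinations
from typing import Sequence

Scores = Sequence[float]  # one cost per criterion, 0 (good) to 1 (bad)


def dominates(a: Scores, b: Scores) -> bool:
    """True if plan a is better (lower cost) than plan b along all dimensions."""
    return all(x < y for x, y in zip(a, b))


def incomparable(a: Scores, b: Scores) -> bool:
    """True if each plan beats the other along at least one dimension."""
    a_better_somewhere = any(x < y for x, y in zip(a, b))
    b_better_somewhere = any(y < x for x, y in zip(a, b))
    return a_better_somewhere and b_better_somewhere


def diversity(plans: Sequence[Scores]) -> float:
    """Sum of pairwise Euclidean distances divided by the number of plans.

    Summing over unordered pairs is an assumption about the slide's formula.
    """
    if not plans:
        return 0.0
    total = sum(math.dist(p, q) for p, q in combinations(plans, 2))
    return total / len(plans)


plan_a, plan_b = (0.1, 0.5), (0.3, 0.4)
print(dominates(plan_a, plan_b), incomparable(plan_a, plan_b))
print(diversity([plan_a, plan_b]))
```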
Closest-School Plan
[Map of the closest-school plan; labeled school: Marriottsville High (new)]
Closest [outer] vs. Recommended [inner]
Searching for Good Plans
Multiattribute Optimization
Previous approaches to multiattribute optimization:
  Weighted methods: Combine attributes into a single weighted sum
  Priority-based methods: Optimize one attribute, then perform constrained optimization on the other attributes
  MOA* (and variations): Find all nondominated solutions using heuristic search
  Evolutionary methods: Use genetic search to explore the population space using recombination and fitness-based selection
Redistricting domain:
  No single set of weights or prioritization scheme
  Very large search space, so we can’t find all (or even most) nondominated solutions
  Use local search
Basic Hill-Climbing
Baseline: Choose an initial plan as a starting point (seed), then hill-climb through “[weighted] sum of criteria” space
Seed options:
  closest-school plan
  current plan
  random plan
  “breadth-first” assignment
  “minimum-spanning-tree” assignment
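The baseline can be sketched as a steepest-descent loop over a weighted sum of criterion costs. Everything below is an illustrative assumption rather than the authors' implementation: a plan is a tuple giving each polygon's school index, a move reassigns a single polygon, costs are minimized (so the slides' local "maximum" of quality is a local minimum of cost here), and `toy_evaluate` is a made-up two-criterion evaluator.

```python
from typing import Callable, Iterator, Sequence, Tuple

Plan = Tuple[int, ...]  # plan[i] is the school index assigned to polygon i


def neighbors(plan: Plan, num_schools: int) -> Iterator[Plan]:
    """All plans that differ from `plan` in exactly one polygon's school."""
    for polygon, current in enumerate(plan):
        for school in range(num_schools):
            if school != current:
                yield plan[:polygon] + (school,) + plan[polygon + 1:]


def weighted_cost(costs: Sequence[float], weights: Sequence[float]) -> float:
    return sum(w * c for w, c in zip(weights, costs))


def hill_climb(seed: Plan, num_schools: int,
               evaluate: Callable[[Plan], Sequence[float]],
               weights: Sequence[float]) -> Plan:
    """Move to the best neighbor until no neighbor lowers the weighted cost."""
    current = seed
    current_cost = weighted_cost(evaluate(current), weights)
    while True:
        best, best_cost = current, current_cost
        for candidate in neighbors(current, num_schools):
            cost = weighted_cost(evaluate(candidate), weights)
            if cost < best_cost:
                best, best_cost = candidate, cost
        if best == current:
            return current  # local optimum reached
        current, current_cost = best, best_cost


NEAREST = (0, 0, 1, 1)  # hypothetical nearest school for each of 4 polygons


def toy_evaluate(plan: Plan) -> Tuple[float, float]:
    """Toy criteria: share of polygons not at their nearest school, imbalance."""
    mismatch = sum(a != b for a, b in zip(plan, NEAREST)) / len(plan)
    imbalance = abs(plan.count(0) - plan.count(1)) / len(plan)
    return (mismatch, imbalance)


result = hill_climb(seed=(0, 0, 0, 0), num_schools=2,
                    evaluate=toy_evaluate, weights=(1.0, 1.0))
print(result)
```

Running the same loop from several seeds or with several weight vectors corresponds to the unweighted and weighted hill-climbing baselines reported in the results.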
Biased Hill-Climbing
General approach:
  Choose an initial plan as a seed
  Hill-climb through “dominated plan” space
  Save “incomparable plans” as they are encountered
  Stop at a local maximum
  Restart search starting from a plan in the incomparable list
Blind bias: At restart, choose a plan from the incomparable list at random
Diversity bias: At restart, choose the plan that is farthest in evaluation space from the solutions found so far
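The two restart rules differ only in how the next seed is drawn from the incomparable list. The sketch below shows that selection step alone, with plans represented by their cost vectors. Reading "farthest" as "largest distance to the nearest solution found so far" is an assumption (the slide does not define it), and `pick_restart` is a hypothetical helper name.

```python
import math
import random
from typing import Optional, Sequence

Scores = Sequence[float]  # a plan's position in evaluation space


def pick_restart(incomparable: Sequence[Scores],
                 found: Sequence[Scores],
                 bias: str = "diversity",
                 rng: Optional[random.Random] = None) -> Scores:
    """Choose the plan from which the next hill-climbing run restarts."""
    if not incomparable:
        raise ValueError("no incomparable plans saved to restart from")
    if bias == "blind" or not found:
        # Blind bias: uniform random choice from the incomparable list.
        rng = rng or random.Random()
        return rng.choice(list(incomparable))

    # Diversity bias (assumed reading): maximize the distance to the
    # closest solution already found.
    def separation(candidate: Scores) -> float:
        return min(math.dist(candidate, solution) for solution in found)

    return max(incomparable, key=separation)


found_so_far = [(0.1, 0.1)]
saved = [(0.2, 0.1), (0.9, 0.0), (0.1, 0.5)]
print(pick_restart(saved, found_so_far, bias="diversity"))
print(pick_restart(saved, found_so_far, bias="blind", rng=random.Random(0)))
```

An alternative reading would maximize the summed distance to all found solutions; which of the two the authors used is not stated on the slide.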
Results
Quality of Generated Plans
Quality of generated plans is better than manually generated plans
  ...with respect to the particular evaluation criteria we’ve defined
Original-plan seed does better than closest-plan seed, which leads to the “wrong” local maximum
Compared to the recommended plan, generated plans generally perform:
  ...better with respect to capacity
  ...comparably with respect to socioeconomic and academic measures
  ...better with respect to busing costs
  ...comparably with respect to walk utilization
[Figure legend] Outer: Recommended; Inner: Best generated
Diversity of Generated Plans
Baselines:
  Manual plans: closest, recommended, and alternative (diversity measure = 0.223)
  Unweighted hill-climbing: multiple runs of basic hill-climbing with different initial seeds (diversity measure = 0.044)
  Weighted hill-climbing: multiple runs of basic hill-climbing with different weight vectors (diversity measure = 0.048)
Biased hill-climbing:
  with blind bias: diversity measure = 0.032
  with diversity bias: diversity measure = 0.389
Note: Biased hill-climbing yields a somewhat worse average unweighted sum, so the plans are not quite “as good” in a direct comparison
Future Work and Conclusions
Future Work
Modeling additional evaluation criteria:
  Feeder statistics
  Redistricting frequency
Incorporating projected future demographic shifts into evaluation, search, and visualization
Extensions to search methods:
  Other definitions of diversity (e.g., dispersion, similar to k-means mean-squared error)
  Other multiattribute optimization methods (particularly genetic methods)
Visualization extensions:
  Visualizing feeder patterns
  Computing and visualizing gradients in search space
Deployment and user testing
Conclusions
School redistricting is an important and challenging problem
The multiattribute optimization framework is a good paradigm for this application
Novel search techniques and evaluation methods are needed
Diversity-biased hill climbing is a promising initial approach