Approximation Algorithms for Min-Max Generalization Problems
Piotr Berman and Sofya Raskhodnikova, Pennsylvania State University
Input: data items with weights, a lower bound wlb
Goal: partition items into groups of weight ≥ wlb while minimizing the maximum weight of a group
Rules of legal partitioning for MIN-MAX generalization problems
• MIN-MAX GRAPH PARTITION: a graph into connected subgraphs
• MIN-MAX RECTANGLE TILING: 2-dim array into non-overlapping contiguous rectangles
• MIN-MAX BIN COVERING: unstructured data into arbitrary groups
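As a minimal sketch of the shared objective, the following Python function scores a candidate solution for MIN-MAX BIN COVERING, where any grouping of the items is legal. The name `min_max_cost` and the dictionary-based input format are illustrative assumptions, not part of the talk.

```python
from typing import Dict, Hashable, List, Optional


def min_max_cost(weights: Dict[Hashable, float],
                 groups: List[List[Hashable]],
                 wlb: float) -> Optional[float]:
    """Return the maximum group weight, or None if the partition is illegal."""
    items = [x for g in groups for x in g]
    # Every item must appear in exactly one group.
    if len(items) != len(set(items)) or set(items) != set(weights):
        return None
    totals = [sum(weights[x] for x in g) for g in groups]
    # Every group must reach the weight lower bound wlb.
    if any(t < wlb for t in totals):
        return None
    return max(totals, default=0.0)


weights = {"a": 2, "b": 3, "c": 4, "d": 1}
print(min_max_cost(weights, [["a", "b"], ["c", "d"]], wlb=4))  # 5
print(min_max_cost(weights, [["a"], ["b", "c", "d"]], wlb=4))  # None
```

For GRAPH PARTITION and RECTANGLE TILING the same scoring applies, with an extra legality check (connectivity, or rectangular shape) that this sketch omits.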
Min-Max Generalization [Du Eppstein Goodrich Lueker 09]
[Figure: example instances of GRAPH PARTITION (a graph with vertex weights), RECTANGLE TILING (a 2-dim array) and BIN COVERING]
image source for BIN COVERING: http://www.developerfusion.com/article/5540/bin-packing/4 and Michael Goodrich’s slides on [DEGL 09]
Example 2-dim array for RECTANGLE TILING:
3 4 5 6 3
1 1 1 1 3
2 3 1 1 2
1 2 3 4 1
3 4 5 6 3
Motivation [Du Eppstein Goodrich Lueker 09]
Generalization is used in privacy preserving data mining to achieve k-anonymity [Samarati Sweeney 98]
• replace each value with a more general one so that
• no category has fewer than k members (weight lower bound wlb = k)
• unstructured data → MIN-MAX BIN COVERING
• GPS coordinates → MIN-MAX RECTANGLE TILING
• ZIP codes → MIN-MAX GRAPH PARTITION
Image source: ℓ-Diversity: Privacy Beyond k-Anonymity, Ashwin Machanavajjhala, Johannes Gehrke, Daniel Kifer, Muthuramakrishnan Venkitasubramaniam, Department of Computer Science, Cornell University
Disclaimer:
• Known issues with k-anonymity
• Use differential privacy [Dwork McSherry Nissim Smith 06] when possible
Image source: http://www.knowldentitlow.com/PrivacyPolicy.php
Additional Motivation
• Geographic Information Systems
– Partition geographic information into pages that can be transmitted to a mobile device / retrieved from secondary storage
– Graph of road network → MIN-MAX GRAPH PARTITION
– Coordinate data → MIN-MAX RECTANGLE TILING
• Scheduling
– Assign jobs to workers to minimize makespan
– Union rules: each worker must be hired for ≥ wlb hours → MIN-MAX BIN COVERING
• Grandfather Frost
– Distribute presents of different values to kids
– Each kid must get a bundle of value at least wlb
– Minimize the value of max bundle, to avoid jealousy → MIN-MAX BIN COVERING
– Maximize the number of kids who get presents → classical BIN COVERING
Image sources: http://gisagents.blogspot.com/2009/08/new-publication-random-planar-graphs.html, http://www.travelforlife.ru/articles/troyka.jpg.att.jpg, http://thedailymind.com/wp-content/uploads/2008/04/bad-job-china.jpg
Results on Min-Max Generalization Problems
Min-Max Problem                   Hardness [DEGL 09]   Ratio in [DEGL 09]            Our Ratio
GRAPH PARTITION                   2                    --                            3 (main result)
  on 3-connected planar graphs    1.33                 4
  on 4-connected planar graphs    --                   3                             2.5
BIN COVERING                      2                    2 + ε in time exp in ε^-1     2 (completely resolved)
RECTANGLE TILING                  1.33                 5                             4
  with 0-1 entries                --                   --                            3
Other Related Work
• Classical BIN COVERING [Assmann Johnson Kleitman Leung 84, Csirik Johnson Kleitman 01, Jansen Solis-Oba 03, …]
• Rectangle tiling problems with other optimization criteria [Manne 93, Khanna Muthukrishnan Paterson 98, Sharp 99, Smith Suri 99, Muthukrishnan Poosala Suel 99, Berman DasGupta Muthukrishnan Ramaswami 01, Berman DasGupta Muthukrishnan 02, …]
Our 3-Approximation for MIN-MAX GRAPH PARTITION
Input: undirected graph with vertex weights, a lower bound wlb
Goal: partition the graph into connected subgraphs of weight ≥ wlb while minimizing the maximum weight of a part
Phases of the Algorithm
1. Construct a preliminary 2-tier partition
2. Improve the initial 2-tier partition
3. Run a 2-approximation algorithm of [Lenstra Shmoys Tardos 90] for Scheduling on Unrelated Parallel Machines (SUPM)
4. Repair parts of insufficient weight
SUPM
Input: m parallel machines, n jobs, processing times p_ji for job j on machine i
Goal: schedule each job on some machine while minimizing the makespan
• The reduction to SUPM is gray-box: black-box in the algorithm, but we look inside the box for the analysis.
MIN-MAX GRAPH PARTITION: Definitions/Assumptions
• A node (or a subgraph) is lean if its weight is < wlb; fat otherwise.
• W.l.o.g. assume the input graph is connected and wlb = 1.
• In this talk: assume all nodes are lean.
2-Tier Partition
1. Partition nodes into lean groups (wt < 1)
2. Partition groups into fat supergroups (wt ≥ 1)
Nodes in a group (supergroup) must induce a connected subgraph.
Intuition: supergroups ≡ parts in a legal partition; groups ≡ nearly indivisible subgraphs.
Recall: want a 3-approximation → if all supergroups have ≤ 3 groups, done.
Types of Supergroups
Group-pair: consists of 2 groups
Triangle: consists of 3 groups pairwise connected by an edge
Star: consists of ≥ 3 groups. When the central group is removed, the remaining groups are connected components.
[Figure: example showing a star with its central group, a group-pair and a triangle]
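The three types can be told apart from the adjacency among the groups of one supergroup. The sketch below is only an illustration: the function name `classify_supergroup` and the input format (a dictionary mapping each group to the set of groups in the same supergroup that share an edge with it) are assumptions, not part of the talk.

```python
from typing import Dict, Hashable, Set


def classify_supergroup(gadj: Dict[Hashable, Set[Hashable]]) -> str:
    """Classify a supergroup as group-pair, triangle, star or other."""
    n = len(gadj)
    if n == 2 and all(len(nb) == 1 for nb in gadj.values()):
        return "group-pair"
    if n == 3 and all(len(nb) == 2 for nb in gadj.values()):
        return "triangle"
    if n >= 3:
        for center, nb in gadj.items():
            others = set(gadj) - {center}
            # Removing the center must leave every other group isolated,
            # so each one is adjacent to the center and to nothing else.
            if nb == others and all(gadj[g] == {center} for g in others):
                return "star"
    return "other"


star = {"C": {"X", "Y", "Z"}, "X": {"C"}, "Y": {"C"}, "Z": {"C"}}
path4 = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
print(classify_supergroup(star))   # star
print(classify_supergroup(path4))  # other
```

Note that under this reading a path of 3 groups is a star whose central group is the middle one.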
Phase 1: Obtaining Initial 2-Tier Partition
1. Form groups greedily:
a. Make each node a group.
b. While there are adjacent groups G, H such that G ∪ H is lean, merge G and H.
2. Form supergroups greedily:
a. Form one supergroup that includes all groups.
b. While there is a supergroup with ≥ 4 groups that is not a star, split it.
Claim: In the initial 2-tier partition
1. G ∪ H is fat for all adjacent groups G, H.
2. Each supergroup is a group-pair, a triangle or a star.
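The greedy group-forming step of Phase 1 can be sketched as follows, under the talk's assumptions that all nodes are lean and wlb = 1. The function name `form_groups` and the adjacency-list input are illustrative choices. The supergroup-splitting step is not shown because the slides do not spell out how a split is chosen.

```python
from typing import Dict, Hashable, List, Set


def form_groups(adj: Dict[Hashable, List[Hashable]],
                weight: Dict[Hashable, float],
                wlb: float = 1.0) -> List[Set[Hashable]]:
    """Greedily merge adjacent groups while their union stays lean."""
    group_of = {v: v for v in adj}        # node -> id of its group
    members = {v: {v} for v in adj}       # group id -> nodes in the group
    total = {v: weight[v] for v in adj}   # group id -> group weight
    merged = True
    while merged:
        merged = False
        for u in adj:
            for v in adj[u]:
                g, h = group_of[u], group_of[v]
                # Merge two distinct adjacent groups if the union is lean.
                if g != h and total[g] + total[h] < wlb:
                    for x in members[h]:
                        group_of[x] = g
                    members[g] |= members.pop(h)
                    total[g] += total.pop(h)
                    merged = True
    return list(members.values())


# Path a - b - c - d, every node of weight 0.4.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
weight = {v: 0.4 for v in adj}
groups = form_groups(adj, weight)
print(sorted(sorted(g) for g in groups))  # [['a', 'b'], ['c', 'd']]
```

When the loop stops, no two adjacent groups have a lean union, which is the first part of the claim.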
Mobile Groups
A group is mobile if it is not in a group-pair and it is not a central group, i.e., if it is removed from its supergroup, the supergroup is still of one of the allowed types.
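A small sketch of this definition, assuming a supergroup is described by its type, its list of groups and (for a star) its central group. The function names and this representation are hypothetical, chosen only for illustration.

```python
from typing import Hashable, List, Optional, Set


def mobile_groups(kind: str, groups: List[Hashable],
                  center: Optional[Hashable] = None) -> Set[Hashable]:
    """Groups that are neither in a group-pair nor a central group."""
    if kind == "group-pair":
        return set()
    if kind == "triangle":
        return set(groups)
    if kind == "star":
        return set(groups) - {center}
    raise ValueError(f"unknown supergroup type: {kind}")


def type_after_removal(kind: str, num_groups: int) -> str:
    """Type of the supergroup that remains after one mobile group leaves."""
    if kind == "triangle":
        return "group-pair"
    if kind == "star":
        # A star with 3 groups shrinks to its center plus one group.
        return "star" if num_groups - 1 >= 3 else "group-pair"
    raise ValueError("group-pairs have no mobile groups")


print(sorted(mobile_groups("star", ["C", "X", "Y", "Z"], center="C")))  # ['X', 'Y', 'Z']
print(type_after_removal("star", 4))  # star
print(type_after_removal("star", 3))  # group-pair
```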
Goal of Phase 2 of the Algorithm
Structured supergroups: stars with central nodes
Other supergroups: supergroups with ≤ 3 groups
Mobile groups of structured supergroups are adjacent only to central nodes of structured supergroups.
They will be repartitioned among central nodes using a reduction to SUPM.
Transformations: Perform the first that applies
1. Form a group-pair from 2 adjacent mobile groups if they belong to different supergroups or to a supergroup with ≥ 4 groups.
Result: No adjacent mobile groups, except for groups in the same triangle.
2. Split a central group if it has nodes u, v to which mobile groups connect.
Result: Each central group has a unique central node to which mobiles connect.
What Remains to Achieve Our Goal
Mobile groups of structured supergroups are adjacent only to central nodes of structured supergroups.
Last Transformation
3. If a mobile from a star with ≥ 4 groups "chain connects" to a group-pair, chain reconnect.
Result: Goal achieved.
Phase 3 of the Algorithm
1. "Other" supergroups become parts in the final partition.
2. Each central node of a structured supergroup gets its own part.
3. Mobiles of structured supergroups are repartitioned among central nodes.
4. SUPM instance: central nodes are machines, mobile groups are jobs, and weights are processing times.
5. Other nodes from central groups join parts of their central nodes.
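The SUPM instance of this phase can be sketched on a toy example. Several details here are assumptions made for illustration: the function names, the use of an infinite processing time for a mobile group that is not adjacent to a central node, and the omission of the central nodes' own weights from the loads. The exhaustive search below is a stand-in that only works for tiny instances; it is not the [Lenstra Shmoys Tardos 90] algorithm used in the talk.

```python
import itertools
import math
from typing import Dict, Hashable, List, Optional, Set, Tuple


def build_supm(machines: List[Hashable], jobs: List[Hashable],
               weight: Dict[Hashable, float],
               adjacent: Set[Tuple[Hashable, Hashable]]
               ) -> Dict[Hashable, Dict[Hashable, float]]:
    """p[j][i] = weight of mobile group j if it touches central node i."""
    return {j: {i: weight[j] if (j, i) in adjacent else math.inf
                for i in machines}
            for j in jobs}


def brute_force_schedule(p: Dict[Hashable, Dict[Hashable, float]],
                         machines: List[Hashable]
                         ) -> Tuple[float, Optional[Dict[Hashable, Hashable]]]:
    """Try every assignment of jobs to machines; return the best makespan."""
    jobs = list(p)
    best, best_assign = math.inf, None
    for choice in itertools.product(machines, repeat=len(jobs)):
        load = {i: 0.0 for i in machines}
        for j, i in zip(jobs, choice):
            load[i] += p[j][i]
        makespan = max(load.values(), default=0.0)
        if makespan < best:
            best, best_assign = makespan, dict(zip(jobs, choice))
    return best, best_assign


machines = ["c1", "c2"]                       # central nodes
weight = {"m1": 0.5, "m2": 0.5, "m3": 0.25}   # mobile groups (all lean)
adjacent = {("m1", "c1"), ("m2", "c1"), ("m2", "c2"), ("m3", "c2")}
p = build_supm(machines, list(weight), weight, adjacent)
print(brute_force_schedule(p, machines))
# (0.75, {'m1': 'c1', 'm2': 'c2', 'm3': 'c2'})
```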
Analysis highlights
1. SUPM algorithm gives output of value ≤ OPTf + 1, where OPTf is a fractional optimum.
2. OPTf ≤ our OPT
3. Our guarantee: OPTf + 1 + wt(nodes in a central group) ≤ OPT + 2
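The highlights can be chained into the factor 3 as follows. This is a sketch under the talk's normalization wlb = 1 and its assumption that all nodes are lean; the last step, which uses OPT ≥ 1, is not stated on the slide but follows because every part of a legal partition has weight at least wlb. The chain covers parts built from structured supergroups; a part that comes from an "other" supergroup consists of at most 3 lean groups, so its weight is below 3 ≤ 3·OPT.

```latex
\begin{align*}
\mathrm{wt}(\text{part})
  &\le (\mathrm{OPT}_f + 1) + \mathrm{wt}(\text{nodes in a central group}) \\
  &\le \mathrm{OPT}_f + 2
     && \text{(a central group is lean, so its weight is $< w_{lb} = 1$)} \\
  &\le \mathrm{OPT} + 2
     && \text{(since $\mathrm{OPT}_f \le \mathrm{OPT}$)} \\
  &\le 3\,\mathrm{OPT}
     && \text{(every legal part has weight $\ge 1$, so $\mathrm{OPT} \ge 1$)}
\end{align*}
```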