Approximation by Reweighting
By Simion Novikov
Outline
Discrepancy
Spanning trees
Set Cover
Guarding an art gallery
Set Cover by Linear Programming
Discrepancy
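For a range space $S = (X, \mathcal{R})$, let $\chi : X \to \{-1, +1\}$ be a coloring.
The discrepancy of $\chi$ over a range $r \in \mathcal{R}$ is $|\chi(r)| = \left|\sum_{p \in r} \chi(p)\right|$, and the discrepancy of the coloring is $\mathrm{disc}(\chi) = \max_{r \in \mathcal{R}} |\chi(r)|$.
The discrepancy of the range space is $\mathrm{disc}(S) = \min_{\chi} \mathrm{disc}(\chi)$.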
Bound on Discrepancy
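For a finite point set $P$, divide $P$ into pairs, and color the two endpoints of each pair differently, choosing at random which endpoint gets $+1$.
Now $\chi(r)$ depends only on the pairs that are crossed by $r$, i.e. pairs where one point is inside the range and the other is outside.
By Chernoff's inequality, with probability at least 0.5, every range $r$ satisfies $|\chi(r)| < \sqrt{2 \#_r \ln(4m)}$, where $\#_r$ is the number of pairs crossing $r$ and $m$ is the number of ranges.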
$\varepsilon$-sample from discrepancy.
The Theorem states:
Define:
Then for an s such that
is the required sample size.
For
This gives
Assumptions
We can handle large numbers in $O(1)$ time per operation.
We only need addition and comparing.
We can pick random elements from a weighted set.
Picking r elements takes
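The sampling assumption can be illustrated with a minimal sketch. The function name `weighted_sample` and the prefix-sum approach are illustrative choices, not taken from the slides; Python integers are unbounded, so weights that were doubled many times stay exact.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def weighted_sample(items, weights, r, rng=random):
    """Pick r items independently (with replacement), each with probability
    proportional to its non-negative integer weight."""
    prefix = list(accumulate(weights))   # exact prefix sums, even for huge weights
    total = prefix[-1]
    picks = []
    for _ in range(r):
        t = rng.randrange(total)         # uniform integer in [0, total)
        picks.append(items[bisect_right(prefix, t)])
    return picks
```

Only addition and comparison of weights are used, matching the assumptions listed above. Each pick costs a binary search over the prefix sums.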
Spanning Tree with low crossing number
Reminder: Spanning Tree
Given N points, compute a tree that connects them all.
Kruskal's algorithm:
At each step, add to the graph the lightest edge that connects two different connected components.
For graphs, this takes
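As a reminder of the baseline, here is a minimal sketch of Kruskal's algorithm with a union-find structure. The edge format `(weight, u, v)` and the function name are assumptions made for this example.

```python
def kruskal(n, edges):
    """edges: list of (weight, u, v) with 0 <= u, v < n.
    Returns the edges of a minimum spanning forest."""
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the component representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):        # lightest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                     # edge joins two different components
            parent[ru] = rv
            tree.append((w, u, v))
    return tree
```

Sorting the edges dominates the running time in this version.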
ST with low crossing number
On a plane, we are given a set of points P.
The crossing number of a tree is the maximum number of segments that can be crossed by any single line.
We want a spanning tree with a minimal crossing number.
Crossing number is 2.
Crossing number is 3.
Any line?
Instead of taking all the lines, we'll only consider a representative, finite set L of lines that separate P into different halves.
The number of lines in L is $O(n^2)$. Rotate each such line clockwise until it hits 2 points of P. For 2 different lines these pairs of points will be different. Therefore the number of lines is at most $\binom{n}{2} = O(n^2)$.
Crossing distance:
For the set of lines L, and points p and q:
The crossing distance between p and q is the number of lines of L that need to be crossed to get from one point to the other, i.e. the number of lines of L that intersect the segment pq.
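A small sketch of the crossing distance, assuming each line is given as a triple `(a, b, c)` meaning $ax + by = c$ (a representation chosen for this example). A line is counted when the two points lie strictly on opposite sides of it.

```python
def side(line, point):
    """Signed value telling on which side of the line ax + by = c the point lies."""
    a, b, c = line
    x, y = point
    return a * x + b * y - c


def crossing_distance(lines, p, q):
    """Number of lines that separate p from q."""
    return sum(1 for ln in lines if side(ln, p) * side(ln, q) < 0)


lines = [(1, 0, 0.5), (1, 0, 1.5), (0, 1, 5)]   # x = 0.5, x = 1.5, y = 5
print(crossing_distance(lines, (0, 0), (2, 0)))  # prints 2
```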
Definitions:
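An arrangement of lines is the subdivision of the plane into regions created by the lines in L.
The number of vertices of the arrangement is at most $\binom{|L|}{2}$, one per pair of lines.
The ball of radius r around p under the crossing metric is the set of points at crossing distance at most r from p. We count the number of vertices of the arrangement inside the ball.
(In the pictured example this count is 25.)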
Algorithm
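Kruskal's algorithm with changes:
The weight of a segment between two points is dynamic. It depends on which segments we have added so far.
The algorithm only guarantees that the crossing number of the result will be no worse than $O(\sqrt{n})$.
This cannot be improved in general: consider n points arranged in a $\sqrt{n} \times \sqrt{n}$ square grid.
At iteration i:
$E_i$ is the set of segments added so far.
For a line $\ell$ in L, the weight of $\ell$ is $2^{n_i(\ell)}$, where $n_i(\ell)$ is the number of segments of $E_i$ that cross $\ell$. For a segment s between points in P, the weight of s is the total weight of the lines of L that cross s.
This defines the weights of the segments for the algorithm.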
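The reweighting loop can be sketched as follows. This is a simple cubic-time illustration, not the slides' implementation. It assumes lines are triples `(a, b, c)` for $ax + by = c$, that every line starts with weight 1, and that a line's weight doubles each time a chosen segment crosses it; the function names are hypothetical.

```python
from itertools import combinations


def crosses(line, p, q):
    """True when p and q lie strictly on opposite sides of the line ax + by = c."""
    a, b, c = line
    return (a * p[0] + b * p[1] - c) * (a * q[0] + b * q[1] - c) < 0


def low_crossing_tree(points, lines):
    n = len(points)
    weight = [1] * len(lines)            # assumed initial weight of every line
    comp = list(range(n))                # component label of each point
    # For each candidate segment, the indices of the lines that cross it.
    cross = {(i, j): [k for k, ln in enumerate(lines)
                      if crosses(ln, points[i], points[j])]
             for i, j in combinations(range(n), 2)}
    tree = []
    for _ in range(n - 1):
        # Lightest segment (by current line weights) joining two components.
        i, j = min((e for e in cross if comp[e[0]] != comp[e[1]]),
                   key=lambda e: sum(weight[k] for k in cross[e]))
        tree.append((i, j))
        for k in cross[(i, j)]:
            weight[k] *= 2               # reweight the lines that were crossed
        old, new = comp[j], comp[i]
        comp = [new if c == old else c for c in comp]
    return tree


def crossing_number(tree, points, lines):
    """Largest number of tree segments crossed by a single line (lines non-empty)."""
    return max(sum(crosses(ln, points[i], points[j]) for i, j in tree)
               for ln in lines)
```

Doubling makes heavily crossed lines expensive, so later segments tend to avoid them.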
Lower bound for size of ball
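Lemma 1: the ball of radius r around p contains at least $r^2/8$ vertices of the arrangement (as long as there are enough lines to cross).
Proof: start walking from p until you cross $r/2$ lines. Then from all the intersections follow the intersected lines for another $r/2$ crossings. All the vertices reached are less than r away from p, there are $(r/2) \cdot (r/2) = r^2/4$ of them counting repetitions, and each was counted no more than twice (once from each of the two lines through it). This gives the lower bound.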
Upper bound on weight of new segment
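Lemma 2: given a set P of n points and a set L of lines where the total weight of the lines is W, there are two points p and q in P such that the total weight of the lines crossing the segment pq is $O(W/\sqrt{n})$.
Proof: replace each line in L by as many (slightly perturbed) copies as its weight, so that a line of weight 3 becomes 3 lines and there are W lines in total.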
Upper bound on weight of new segment- contd.
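By Lemma 1, if the balls of radius r around the points of P don't intersect, together they contain at least $n r^2/8$ vertices. If they do intersect, there are 2 points with a crossing distance of at most 2r between them.
The number of vertices in total is at most $\binom{W}{2} \le W^2/2$.
So by expanding r, we expand the balls until two of them touch. This has to happen before the balls cover the whole plane, i.e. it has to happen when $n r^2/8 \le W^2/2$, that is when $r \le 2W/\sqrt{n}$.
Therefore, there are 2 points p and q such that the crossing distance between them is at most $2r \le 4W/\sqrt{n}$.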
Upper bound on weight
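Denote by $W_i$ the total weight of the lines at iteration i of the algorithm, for $i = 0, \dots, n-1$, with $W_0 = |L|$.
By Lemma 2, applied to one point from each of the $n - i$ remaining connected components, the segment added has weight at most $c W_i/\sqrt{n - i}$ for some constant c, and doubling the lines it crosses adds exactly that weight: $W_{i+1} \le \left(1 + \frac{c}{\sqrt{n - i}}\right) W_i$
Conclusion
Therefore, $W_{n-1} \le W_0 \prod_{j=2}^{n} \left(1 + \frac{c}{\sqrt{j}}\right) \le |L| \exp\left(c \sum_{j=1}^{n} \frac{1}{\sqrt{j}}\right)$
The sum of the series is bounded: $\sum_{j=1}^{n} \frac{1}{\sqrt{j}} \le 2\sqrt{n}$
Therefore: $W_{n-1} \le |L| \, e^{2c\sqrt{n}} = O(n^2) \, e^{2c\sqrt{n}}$
Therefore for the heaviest line $\ell$, which crosses $t$ segments, $2^t \le W_{n-1}$, so $t \le \log_2 W_{n-1} = O(\sqrt{n})$.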
Applications
Discrepancy:
Take the resulting tree, and turn it into a cycle by walking around its outside, traversing every edge twice.
Then take shortcuts, skipping vertices that were already visited.
Discrepancy- cont.
The first step doubled the crossing number, while the second step can only reduce it. If we now take only the even edges of the cycle, we have a perfect matching with a crossing number of $O(\sqrt{n})$.
Plugging this into the formula for discrepancy, with $O(\sqrt{n})$ crossed pairs per range and $O(n^2)$ ranges, we get: $\mathrm{disc} = O\left(n^{1/4} \sqrt{\log n}\right)$
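The tree-to-matching step can be sketched like this. Visiting the vertices in depth-first preorder is the same as walking around the tree and skipping vertices already visited; pairing consecutive vertices then keeps every other edge of the resulting cycle. The function name and input format are assumptions for this example.

```python
def matching_from_tree(n, tree_edges, root=0):
    """tree_edges: list of (u, v) forming a spanning tree on vertices 0..n-1.
    Returns pairs of vertices that are consecutive in the shortcut tour."""
    adj = [[] for _ in range(n)]
    for u, v in tree_edges:
        adj[u].append(v)
        adj[v].append(u)
    order, seen, stack = [], [False] * n, [root]
    while stack:
        u = stack.pop()
        if seen[u]:
            continue                     # shortcut: skip already visited vertices
        seen[u] = True
        order.append(u)
        stack.extend(reversed(adj[u]))   # visit neighbours in their listed order
    # Keep every other edge of the tour: (1st, 2nd), (3rd, 4th), ...
    return [(order[i], order[i + 1]) for i in range(0, len(order) - 1, 2)]


print(matching_from_tree(4, [(0, 1), (0, 2), (0, 3)]))  # prints [(0, 1), (2, 3)]
```

When n is even this is a perfect matching; when n is odd the last vertex of the tour stays unmatched.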
Spanning Tree for space with bounded shattering dimension.
ST with low crossing number
On a plane, we are given a set of points P
We want to compute a spanning tree that minimizes the maximum number of edges crossed by any single line.
Range space with shattering dimension $\delta$ and dual shattering dimension $\delta^*$.
Range.
Any line?
Instead of taking all the lines, we'll only consider a representative, finite set L of lines that separate P into different halves.
Rotate each such line clockwise until it hits 2 points. Therefore the number of lines is $O(n^2)$.
Range?
Instead of taking all the ranges, we'll only consider the projection of the ranges onto P, i.e. one representative range for each distinct set $r \cap P$.
The shattering and dual shattering dimension of this restricted range space are still bounded by $\delta$ and $\delta^*$, therefore the number of ranges to consider is $O(n^{\delta})$.
Algorithm
Same as for the spanning tree with low crossing number, with the representative ranges instead of lines.
The algorithm only guarantees that the result will be no worse than
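At iteration i:
$E_i$ is the set of segments added so far.
For a range r and a segment s between points in P, r crosses s when it contains exactly one endpoint of s. The weights of ranges and segments are then defined as they were for lines.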
Analysis-definitions
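So for a pair of points u and v, we look at the set of ranges that are crossed by the segment uv, i.e. the ranges that contain exactly one of u and v.
The range space formed by these sets has a shattering dimension bounded by $2\delta^*$, since its ranges are just the symmetric difference between 2 ranges in the dual range space.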
Analysis
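From the weighted ranges, pick a random sample N of the size required by the $\varepsilon$-net theorem. By the $\varepsilon$-net theorem this has constant probability of being an $\varepsilon$-net, i.e. there exists one, so let's assume N is indeed an $\varepsilon$-net.
By definition of an $\varepsilon$-net, if the set of ranges crossed by a segment has no elements from N, then its weight is less than $\varepsilon W$, where W is the total weight of the ranges.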
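However, the number of different ways a point can lie with respect to the ranges of N is bounded through the dual shattering dimension:
$O(|N|^{\delta^*})$.
Analysis
Let's pick $\varepsilon$ so that this number is smaller than n.
Then not all the points of P are separated by the ranges in N.
For a pair of such points u and v, that means none of the ranges that separate them is in the net. So the set of ranges crossed by the segment uv has no elements from the net. Therefore its weight is less than $\varepsilon W$.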
Conclusion
Like previously:
Then taking the log, we get that the heaviest range is
Application
Discrepancy:
Plugging into the Theorem, we get:
Geometric Set Cover
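For a range space $S = (X, \mathcal{R})$ with dual shattering dimension $\delta^*$, find the minimal number of ranges that cover all of X.
Definitions:
The number of ranges is m.
The optimal solution has k ranges.
The simple greedy algorithm gives an $O(\log n)$ approximation, where n is the number of elements:
At each step, pick the range that covers the most uncovered elements.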
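For comparison with the reweighting approach, the greedy rule can be written in a few lines. Ranges are assumed to be Python sets and the function name is illustrative.

```python
def greedy_set_cover(universe, ranges):
    """ranges: list of sets. Returns indices of the chosen ranges, in pick order."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Range covering the most still-uncovered elements (first one on ties).
        best = max(range(len(ranges)), key=lambda i: len(uncovered & ranges[i]))
        if not uncovered & ranges[best]:
            raise ValueError("the ranges do not cover the universe")
        chosen.append(best)
        uncovered -= ranges[best]
    return chosen


ranges = [{0, 1, 2}, {3, 4, 5}, {0, 3}, {1, 4}, {2, 5}]
print(greedy_set_cover(range(6), ranges))  # prints [0, 1]
```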
Algorithm
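Start with a guess k for the size of the optimal solution, and assign weight 1 to each range.
Pick a random sample R of the ranges, according to their weights, of the size needed for an $\varepsilon$-net, where $\varepsilon$ is proportional to $1/k$.
For a point p not covered by R, if the total weight of the ranges containing p is at most an $\varepsilon$ fraction of the total weight, then double the weight of the ranges containing p.
Repeat until R covers X, or until the number of iterations allowed for this k has passed. In the latter case, double k and repeat.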
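The loop above can be sketched as follows. The concrete choices here are assumptions for illustration only and are not taken from the slides: $\varepsilon = 1/(4k)$, a sample size of $8k \log_2(k+1)$, and an iteration budget of $4k \log_2(m/k + 2)$. The function names are hypothetical.

```python
import math
import random
from bisect import bisect_right
from itertools import accumulate


def sample_by_weight(weights, size, rng):
    """Set of indices drawn with replacement, proportionally to integer weights."""
    prefix = list(accumulate(weights))
    return {bisect_right(prefix, rng.randrange(prefix[-1])) for _ in range(size)}


def reweighting_set_cover(points, ranges, rng=None):
    """ranges: list of sets. Returns indices of ranges that together cover points."""
    rng = rng or random.Random()
    points = list(points)
    if not points:
        return []
    if not set(points) <= set().union(*ranges):
        raise ValueError("the ranges do not cover all the points")
    m = len(ranges)
    k = 1                                          # current guess for the optimum
    while True:
        size = math.ceil(8 * k * math.log2(k + 1))        # assumed sample size
        budget = math.ceil(4 * k * math.log2(m / k + 2))  # assumed iteration cap
        weights = [1] * m
        for _ in range(budget):
            picked = sample_by_weight(weights, size, rng)
            covered = set().union(*(ranges[i] for i in picked))
            missed = [p for p in points if p not in covered]
            if not missed:
                return sorted(picked)
            containing = [i for i in range(m) if missed[0] in ranges[i]]
            # Light point: its ranges weigh at most eps = 1/(4k) of the total.
            if 4 * k * sum(weights[i] for i in containing) <= sum(weights):
                for i in containing:
                    weights[i] *= 2
        k *= 2                                     # guess was too small
```

The returned cover is the sample itself, so its size is governed by the sample size for the final k rather than by k alone.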
Analysis
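Count only the iterations in which ranges were doubled (this always happens when the sample was an $\varepsilon$-net), and let $W_i$ be the total weight after the ith such iteration.
When we double the ranges that contain p, one of them is in the optimal solution, since the optimal solution covers p.
$t_i(j)$ is the number of times the jth range in the optimal solution was doubled in the first i iterations, so $\sum_{j=1}^{k} t_i(j) \ge i$.
Analysis-cont.
The weight of the world is at least the sum of the weights of the optimal ranges, and each doubling adds at most an $\varepsilon$ fraction to it.
Therefore: $k \, 2^{i/k} \le \sum_{j=1}^{k} 2^{t_i(j)} \le W_i \le m (1 + \varepsilon)^i$
Therefore, the weight of the optimal ranges goes up faster than the weight of the world, so at some point the algorithm will end.
That is at $i = O(k \log(m/k))$.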
Analysis-Running Time
Each iteration takes :
- Time to process point p.
Time to pick a sample and check it against all the points.
Total time:
Guarding an art gallery
Given a simple polygon P we need to place a minimal number of points so that all of P is visible from these points.
Visibility polygon:
Triangulation is the division of a polygon into triangles by non-crossing diagonals:
The number of triangles is $n - 2$, where n is the number of vertices of P.
They form a tree (two triangles are neighbors when they share a diagonal).
3-coloring:
Let's pick one of the triangles and color its 3 vertices with 3 different colors. This forces the coloring of all the neighbors. If we continue, since this is a tree, 2 forcings won't meet and contradict each other.
Now each triangle has a vertex of each of the 3 colors.
Picking the vertices of one color to be the guards covers all the triangles, so by taking the smallest color class the number of guards is no more than $\lfloor n/3 \rfloor$.
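The coloring argument can be sketched in code, assuming the triangulation is already given as a list of vertex-index triples (computing the triangulation itself is not shown). The function name is hypothetical.

```python
from collections import Counter, defaultdict, deque


def guards_from_triangulation(triangles):
    """triangles: vertex-index triples of a triangulated simple polygon.
    Returns the vertices of the smallest color class of a 3-coloring."""
    by_edge = defaultdict(list)          # edge -> triangles that contain it
    for t, (a, b, c) in enumerate(triangles):
        for e in ((a, b), (b, c), (a, c)):
            by_edge[frozenset(e)].append(t)
    color = dict(zip(triangles[0], (0, 1, 2)))   # color the first triangle freely
    seen, queue = {0}, deque([0])
    while queue:
        a, b, c = triangles[queue.popleft()]
        for u, v in ((a, b), (b, c), (a, c)):
            for nb in by_edge[frozenset((u, v))]:
                if nb not in seen:
                    seen.add(nb)
                    (w,) = set(triangles[nb]) - {u, v}
                    color[w] = 3 - color[u] - color[v]   # forced third color
                    queue.append(nb)
    counts = Counter(color.values())
    best = min(range(3), key=lambda col: counts[col])
    return sorted(v for v, col in color.items() if col == best)


# Hexagon triangulated as a fan from vertex 0.
fan = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 5)]
print(guards_from_triangulation(fan))  # prints [0]
```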
Triangulation
Our solution.
is the range space that we want to solve the set-cover problem for.
To reduce to the previous set-cover, we need to prove that the dual shattering dimension is bounded. We do this by bounding the VC dimension.
In an actual solution, we need to discretize the range space, by providing the possible guard locations X:
, where is a set of representatives from the arrangement generated by the ranges .
X could be:
The vertices. This is .
The vertices and the vertices of the arrangement of the visibility polygons of the vertices. This is .
(Ben-Moshe et al. via James Alexander King) It is possible to construct a set X of size that will contain an optimal solution.
As long as the size of X is polynomial, the exact choice does not matter here, since we are only proving the reduction.
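Example
[Figures: a gallery; the visibility polygons of its candidate guard locations; the arrangement of those visibility polygons with one representative point per region; then the resulting ground set, two example ranges in S, and an example of a solution.]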
Reducing to Set Cover
Now we need to prove that the VC dimension of the range space is a constant.
Let's take G to be a set of size k that is shattered by the ranges. Then for every subset of G, there is a point that sees every point of the subset and none of the other points of G.
Define to be the number of subsets of R that can be seen from inside polygon .
Define to be the number of subsets of R that can be seen from inside polygon , where R is separated from by a line.
=4
Bounding
Let .
In , is separated from by a line . Let be the half plane on the side of .
Every point in R generates a wedge in
Each region in the arrangement of the wedges has a different set of points in R that it sees.
Bounding -cont.
Let's consider the radial order of the points in R from a point in : The order of the point is by .
The order is undefined only if two points in R have the same angle to . But
Therefore all the possible lines between points in (), and the lines in the wedges (), divide into regions. Each region sees its own subset of R, and has its own radial ordering for them.
Introducing back, the edges of limit the view from a point in to a wedge.
This introduces another possibilities.
In total, there are possibilities.
Triangulation.
From the triangulation of the polygon, let's take out a triangle so that none of the created parts has more than of the points from .
The triangle that was taken out has complexity , since there are k points and no obstacles inside the triangle (so each visibility polygon sees a wedge inside the triangle.)
Bounding
Let be the 3 parts, and be the subsets of G inside those parts.
This gives .
But . (since G is shattered)
Therefore , that is a constant.
Set-Cover through Linear Programming
Linear Programming
For a set of variables:
Maximize (or minimize) a linear objective function,
while the variables are constrained by a series of linear inequalities.
Integer programming:
Same as linear programming, but the solution has to be in integers.
IP is NP-hard.
We assume we have an LP-solver.
Translation
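In the language of integer programming, set cover becomes:
P is the set of elements, $\mathcal{R}$ is the set of ranges.
r is a range in $\mathcal{R}$, $x_r \in \{0, 1\}$ is whether this range was chosen for the covering set.
Minimize $\sum_{r \in \mathcal{R}} x_r$, subject to $\sum_{r \ni p} x_r \ge 1$ for every $p \in P$.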
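To make the 0/1 formulation concrete, here is a brute-force sketch that tries every assignment of the indicator variables on a tiny instance. It is exponential in the number of ranges and only meant to show the objective and the constraints; the function name is hypothetical.

```python
from itertools import product


def set_cover_ip_bruteforce(points, ranges):
    """ranges: list of sets. Returns an optimal 0/1 vector x, or None if infeasible."""
    best = None
    for x in product((0, 1), repeat=len(ranges)):
        # Constraint: every point is covered by at least one chosen range.
        feasible = all(
            sum(x[j] for j, r in enumerate(ranges) if p in r) >= 1
            for p in points
        )
        # Objective: minimize the number of chosen ranges.
        if feasible and (best is None or sum(x) < sum(best)):
            best = x
    return best


ranges = [{0, 1, 2}, {3, 4, 5}, {0, 3}, {1, 4}, {2, 5}]
print(set_cover_ip_bruteforce(range(6), ranges))  # prints (1, 1, 0, 0, 0)
```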
Linear Programming
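Changing to linear programming, now $x_r$ could be anywhere in $[0, 1]$ instead of only 0 or 1.
Let $f = \sum_{r \in \mathcal{R}} x_r$,
Now divide the whole problem by f:
Final equation: $\sum_{r \in \mathcal{R}} y_r = 1$ and $\sum_{r \ni p} y_r \ge \varepsilon$ for every $p \in P$, with $y_r \ge 0$.
Or alternatively: maximize $\varepsilon$ under these constraints.
Where $y_r = x_r / f$ and $\varepsilon = 1/f$.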
Solution
After we solve this problem with the LP-solver, we get a value for each range between 0 and 1.
The constraints give us that for all the points, the weight of the ranges covering a point is at least $\varepsilon = 1/f$, where f is the optimal LP value, while the total weight of the ranges is 1.
Thus for the dual range space, with the weights as per the LP solution, an $\varepsilon$-net is a subset of $\mathcal{R}$ such that every element of P whose covering ranges have weight at least $\varepsilon$, that is all of them, is covered.
Analysis
An $\varepsilon$-net is of the size:
So the algorithm is:
Run the LP-solver on the equations.
Obtain the $\varepsilon$-net from the resulting dual range space.
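The second step can be sketched as sampling ranges in proportion to their LP values and checking that the sample covers every point. The fractional solution `x` is taken as an input here (no LP solver is called), and `sample_size` and `max_tries` are hypothetical parameters for this example.

```python
import random


def cover_from_lp(points, ranges, x, sample_size, rng=None, max_tries=1000):
    """x: fractional LP values, one per range. Draws weighted samples of ranges
    until one of them covers every point, and returns it as sorted indices."""
    rng = rng or random.Random()
    for _ in range(max_tries):
        picked = set(rng.choices(range(len(ranges)), weights=x, k=sample_size))
        if all(any(p in ranges[i] for i in picked) for p in points):
            return sorted(picked)
    raise RuntimeError("no covering sample found; try a larger sample_size")
```

For instance, with `ranges = [{0, 1, 2}, {3, 4, 5}, {0, 3}, {1, 4}, {2, 5}]` and the LP solution `x = [1, 1, 0, 0, 0]`, only the first two ranges can ever be drawn, so any covering sample is exactly those two.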