Approximation by Reweighting

By Simion Novikov

Outline

Discrepancy

Spanning trees

Set Cover

Guarding an art gallery

Set Cover by Linear Programming

Discrepancy

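For a range space $S = (X, \mathcal{R})$, let $\chi : X \to \{-1, +1\}$ be a coloring.

The discrepancy for a coloring $\chi$ is $\mathrm{disc}(\chi) = \max_{r \in \mathcal{R}} |\chi(r)|$, where $\chi(r) = \sum_{p \in r} \chi(p)$.

The discrepancy for the range space is $\mathrm{disc}(S) = \min_{\chi} \mathrm{disc}(\chi)$.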

Bound on Discrepancy

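For a set $X$ of $n$ points, divide $X$ into pairs, and color the two endpoints of each pair differently, choosing at random which endpoint gets $+1$.

Now $\chi(r)$ only depends on the pairs that are being crossed by $r$, i.e. where one point is inside the range and the other is outside.

By Chernoff's inequality, with probability at least 0.5, for all ranges $r$ we have $|\chi(r)| = O(\sqrt{\#_r \log m})$, where $\#_r$ is the number of pairs crossing $r$ and $m$ is the number of ranges.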
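The pairing argument can be checked on a toy example. The sketch below is only an illustration: the interval range space, the pairing and the helper names (`pair_coloring`, `chi_of_range`, `crossing_pairs`) are assumptions, not part of the slides. It shows that the color sum of a range never exceeds its number of crossing pairs.

```python
import random

def pair_coloring(pairs, rng):
    """Give the two endpoints of every pair opposite colors, chosen at random."""
    chi = {}
    for a, b in pairs:
        sign = rng.choice((-1, 1))
        chi[a], chi[b] = sign, -sign
    return chi

def chi_of_range(chi, r):
    """chi(r): the sum of the colors of the points inside the range r."""
    return sum(chi[p] for p in r)

def crossing_pairs(pairs, r):
    """Number of pairs with exactly one endpoint inside the range r."""
    return sum((a in r) != (b in r) for a, b in pairs)

# Toy range space (an assumption for illustration): points 0..7, ranges are intervals.
pairs = [(0, 1), (2, 3), (4, 5), (6, 7)]
ranges = [frozenset(range(i, j)) for i in range(8) for j in range(i + 1, 9)]
chi = pair_coloring(pairs, random.Random(0))
disc = max(abs(chi_of_range(chi, r)) for r in ranges)
```

Pairs that lie entirely inside or entirely outside a range cancel out, so only the crossing pairs contribute. This is why a matching with a low crossing number gives a low discrepancy.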

$\varepsilon$-sample from discrepancy.

The Theorem states:

Define:

Then for an s such that

is the required sample size.

For

This gives

Assumptions

We can handle large numbers in $O(1)$ time per operation.

We only need addition and comparing.

We can pick random elements from a weighted set.

Picking r elements takes
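A minimal sketch of picking random elements from a weighted set, using only exact integer addition and comparison so that very large weights (such as repeated doublings) stay exact. The function name `weighted_sample` and the choice of sampling with replacement are assumptions made for illustration.

```python
import bisect
import itertools
import random

def weighted_sample(items, weights, r, rng):
    """Pick r items independently, each with probability weight / total."""
    cumulative = list(itertools.accumulate(weights))  # exact integer prefix sums
    total = cumulative[-1]
    picks = []
    for _ in range(r):
        x = rng.randrange(total)  # uniform integer in [0, total)
        # The first prefix sum strictly greater than x identifies the chosen item.
        picks.append(items[bisect.bisect_right(cumulative, x)])
    return picks

items = ["a", "b", "c"]
weights = [0, 1, 2 ** 2000]  # integer weights far beyond floating-point range
picks = weighted_sample(items, weights, 10, random.Random(1))
```

After the prefix sums are built, each pick costs one binary search. Python integers are unbounded, which stands in for the large-number assumption above.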

Spanning Tree with low crossing number

Reminder: Spanning Tree

Given N points, compute a tree that connects them all.

Kruskal's algorithm:

At each step, add to the graph the edge with the least weight among those that connect different connected components.

For a graph with $E$ edges and $V$ vertices, this takes $O(E \log V)$ time.
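As a reminder of the basic version, here is a short sketch of Kruskal's algorithm with a union-find structure. The edge format `(weight, u, v)` and the function name `kruskal` are assumptions chosen for this example.

```python
def kruskal(n, edges):
    """edges: list of (weight, u, v) over vertices 0..n-1. Returns the chosen tree edges."""
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the component root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):        # lightest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                     # the edge connects two different components
            parent[ru] = rv
            tree.append((w, u, v))
    return tree

edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
tree = kruskal(4, edges)
```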

ST with low crossing number

On a plane, we are given a set of points P.

The crossing number of a tree is the maximum number of segments that can be crossed by any single line.

We want a spanning tree with a minimal crossing number.

Crossing number is 2.

Crossing number is 3.

Any line?

Instead of taking all the lines, we'll only consider a representative, finite set $L$ of lines that separate $P$ into different halves.

The number of lines in $L$ is $O(n^2)$, where $n = |P|$. Rotate each such line clockwise until it hits 2 points. For 2 different lines these pairs of points will be different. Therefore the number of lines is at most the number of pairs of points, which is $O(n^2)$.

Crossing Distance:

For the set of lines $L$, and points $p$ and $q$:

The crossing distance $d_L(p, q)$ is the number of lines of $L$ that need to be crossed to get from $p$ to $q$, i.e. the number of lines that intersect the segment $pq$.
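The crossing distance is easy to compute directly. In this sketch a line is represented as a triple `(a, b, c)` meaning $ax + by = c$; this representation and the function names are assumptions. Points lying exactly on a line are treated as not crossing it.

```python
def side(line, p):
    """Signed value telling on which side of the line a*x + b*y = c the point p lies."""
    a, b, c = line
    return a * p[0] + b * p[1] - c

def crossing_distance(lines, p, q):
    """Number of lines that have p and q strictly on opposite sides."""
    return sum(side(l, p) * side(l, q) < 0 for l in lines)

# Three lines: x = 0.5, x = 1.5 and y = 0.5.
lines = [(1, 0, 0.5), (1, 0, 1.5), (0, 1, 0.5)]
d_same_row = crossing_distance(lines, (0, 0), (2, 0))
d_diagonal = crossing_distance(lines, (0, 0), (2, 1))
```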

Definitions:

An Arrangement of lines is the set of all the regions created by the lines in L.

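The number of vertices is $O(|L|^2)$, since every pair of lines meets in at most one vertex.

The ball around $p$ of radius $r$ using the crossing metric is $b(p, r)$, the set of points at crossing distance at most $r$ from $p$. $v(p, r)$ is the number of vertices inside the ball.

Example (figure): $v(p, r) = 25$.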

Algorithm

Kruskal's algorithm with changes:

Weight of a segment between two points is dynamic. It depends on which segments we added so far.

The algorithm only guarantees that the crossing number of the result will be no worse than $O(\sqrt{n})$, where $n$ is the number of points.

Consider n points arranged in a square grid.

At iteration i:

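$E_i$ is the set of segments added so far.

For a line $\ell$ in $L$ and a segment $s$ between points in $P$: the weight of $\ell$ is $w_i(\ell) = 2^{n_i(\ell)}$, where $n_i(\ell)$ is the number of segments of $E_i$ that cross $\ell$, and the weight of $s$ is $w_i(s) = \sum_{\ell \text{ crosses } s} w_i(\ell)$.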

This defines the weights of the segments for the algorithm.
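The reweighting loop can be sketched as follows: repeatedly add the lightest segment joining two components, then double the weight of every line that segment crosses. This is a brute-force illustration under assumptions (lines given as `(a, b, c)` triples for $ax + by = c$, points in general position with respect to them, quadratic search over all pairs), not an efficient implementation.

```python
from itertools import combinations

def side(line, p):
    a, b, c = line
    return a * p[0] + b * p[1] - c

def crosses(line, p, q):
    """True when the line strictly separates p from q."""
    return side(line, p) * side(line, q) < 0

def low_crossing_tree(points, lines):
    n = len(points)
    weight = [1] * len(lines)   # each line's weight is 2^(number of tree segments crossing it)
    comp = list(range(n))       # comp[i] is the component label of point i
    tree = []
    for _ in range(n - 1):
        best = None
        for i, j in combinations(range(n), 2):
            if comp[i] == comp[j]:
                continue
            w = sum(weight[k] for k, l in enumerate(lines)
                    if crosses(l, points[i], points[j]))
            if best is None or w < best[0]:
                best = (w, i, j)
        _, i, j = best
        tree.append((i, j))
        for k, l in enumerate(lines):          # double every line the new segment crosses
            if crosses(l, points[i], points[j]):
                weight[k] *= 2
        old, new = comp[j], comp[i]            # merge the two components
        comp = [new if c == old else c for c in comp]
    return tree, weight

def crossing_number(points, lines, tree):
    return max(sum(crosses(l, points[i], points[j]) for i, j in tree) for l in lines)

# A 3x3 grid of points and the four axis-parallel lines between its rows and columns.
points = [(x, y) for x in range(3) for y in range(3)]
lines = [(1, 0, 0.5), (1, 0, 1.5), (0, 1, 0.5), (0, 1, 1.5)]
tree, weight = low_crossing_tree(points, lines)
```

Because a heavily crossed line makes every segment crossing it expensive, later iterations steer away from it, which is what keeps the maximum crossing count low.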

Lower bound for size of ball

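Lemma 1: the ball of radius $r$ around any point $p$ contains at least $r^2 / 8$ vertices of the arrangement (as long as there are enough lines to cross, $r \le |L| / 2$).

Proof: start walking from $p$ until you cross $r/2$ lines. Then from all the intersections follow the intersected lines for another $r/2$ crossings. All the vertices are less than $r$ away from $p$, and they were counted no more than twice. This gives the lower bound $\frac{1}{2} \cdot \frac{r}{2} \cdot \frac{r}{2} = \frac{r^2}{8}$.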

Upper bound on weight of new segment

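Lemma 2: given a set $P$ of $n$ points and a set $L$ of lines where the total weight of the lines is $W$, there are two points $p, q \in P$ such that the lines crossing the segment $pq$ have total weight $O(W / \sqrt{n})$.

Proof: Replace each line $\ell$ in $L$ by $w(\ell)$ copies of itself, so that there are $W$ unweighted lines.

Define $B_r$ to be the set of balls of radius $r$ around the points of $P$.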

Upper bound on weight of new segment- contd.

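By Lemma 1, if the balls in $B_r$ (the balls of radius $r$ around the points of $P$) don't intersect, they contain at least $n r^2 / 8$ vertices in total. If they do, there are 2 points with a distance of at most $2r$ between them.

The number of vertices in total is at most $\binom{W}{2} \le W^2 / 2$.

So by expanding $r$, we expand the balls in $B_r$ until two of them touch. This has to happen before the union of the balls is the whole plane, i.e. it has to happen when $n r^2 / 8 \le W^2 / 2$, that is $r \le 2W / \sqrt{n}$.

Therefore, there are 2 points $p$ and $q$ such that the crossing distance between them is at most $2r \le 4W / \sqrt{n}$.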

Upper bound on weight

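Denote $W_i$ to be the total weight of the lines at iteration $i$ in the algorithm, and $n_i = n - i + 1$ the number of connected components at that iteration.

By Lemma 2, applied to one point from each component, the segment added at iteration $i$ has weight at most $c W_i / \sqrt{n_i}$ for a constant $c$. Adding it doubles the weight of every line it crosses, so $W_{i+1} \le W_i \left(1 + \frac{c}{\sqrt{n_i}}\right)$.

Conclusion

Therefore, $W_n \le W_1 \prod_{i=1}^{n-1} \left(1 + \frac{c}{\sqrt{n_i}}\right) \le |L| \exp\left(\sum_{i=1}^{n-1} \frac{c}{\sqrt{n_i}}\right)$.

The sum of the series is bounded: $\sum_{k=1}^{n} \frac{1}{\sqrt{k}} \le 2\sqrt{n}$.

Therefore: $W_n \le |L| \, e^{2c\sqrt{n}}$.

Therefore for the heaviest line $\ell$, which crosses $t$ segments, $2^t \le W_n \le |L| \, e^{2c\sqrt{n}}$, so $t = O(\log |L| + \sqrt{n}) = O(\sqrt{n})$.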

Applications

Discrepancy:

Take the resulting tree, and turn it into a cycle by walking on its outside.

Then take shortcuts skipping vertices that were already visited.

Discrepancy- cont.

The first step doubled the crossing number, while the second step only reduced it. If we now take only the even edges, we have a perfect matching with a crossing number of $O(\sqrt{n})$.
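The tour-and-shortcut step can be sketched with a depth-first traversal: walking around the tree and skipping already visited vertices visits the points in DFS preorder, and pairing consecutive points of that order gives the even edges of the cycle. The function name `tree_to_matching` and the vertex numbering `0..n-1` are assumptions for illustration.

```python
def tree_to_matching(n, tree_edges, root=0):
    """Pair up the vertices of a spanning tree along its shortcut tour (DFS preorder)."""
    adj = [[] for _ in range(n)]
    for u, v in tree_edges:
        adj[u].append(v)
        adj[v].append(u)
    order, seen, stack = [], [False] * n, [root]
    while stack:
        u = stack.pop()
        if seen[u]:
            continue                      # shortcut: skip vertices that were already visited
        seen[u] = True
        order.append(u)
        stack.extend(reversed(adj[u]))    # visit neighbors in their listed order
    # Keep every other edge of the tour: (1st, 2nd), (3rd, 4th), ...
    return [(order[i], order[i + 1]) for i in range(0, n - 1, 2)]

path_matching = tree_to_matching(4, [(0, 1), (1, 2), (2, 3)])
star_matching = tree_to_matching(4, [(0, 1), (0, 2), (0, 3)])
```

With an even number of points every point ends up in exactly one pair. With an odd number the last point of the tour is left unmatched.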

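Plugging this into the formula for discrepancy, with $O(\sqrt{n})$ crossing pairs per line and $O(n^2)$ lines, we get a discrepancy of $O(n^{1/4} \sqrt{\log n})$.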

Spanning Tree for space with bounded shattering dimension.

ST with low crossing number

On a plane, we are given a set of points P

We want to compute a ST so that the minimal number of edges are crossed by any single line.

A range space $S = (X, \mathcal{R})$ with shattering dimension $\delta$ and dual shattering dimension $\delta^*$.

Range.

Any line?

Instead of taking all the lines, we'll only consider a representative, finite set $L$ of lines that separate $P$ into different halves.

Rotate each such line clockwise until it hits 2 points. Therefore the number of lines is $O(n^2)$.

Range?

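Instead of taking all the ranges, we'll only consider the ranges as restricted to $P$, keeping one representative for each distinct subset of $P$.

The shattering and dual shattering dimension of this restricted range space are still bounded by $\delta$ and $\delta^*$, therefore the number of ranges is $O(n^{\delta})$.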

Algorithm

Same as for the minimal crossing number, with ranges instead of lines.

The algorithm only guarantees that the result will be no worse than $O(n^{1 - 1/\delta^*} \log n)$.

At iteration i:

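$E_i$ is the set of segments added so far.

For a range $r$ and $s$ a segment between points in $P$: $w_i(r) = 2^{n_i(r)}$, where $n_i(r)$ is the number of segments of $E_i$ that cross $r$, and $w_i(s)$ is the total weight of the ranges that $s$ crosses.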

Analysis-definitions

So is the set of ranges that are crossed by the segment

). )

has a shattering dimension bounded by 2, since its ranges are just the symmetric difference between 2 ranges in .

Analysis

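Pick a weighted sample $N$ of the ranges. By the $\varepsilon$-net theorem, a large enough sample has constant probability of being an $\varepsilon$-net, i.e. there exists one, so let's assume $N$ is indeed an $\varepsilon$-net.

By definition of an $\varepsilon$-net, if the set of ranges crossed by a segment has no elements from $N$, then its weight is less than $\varepsilon W$, where $W$ is the total weight of the ranges.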

However, the number of ranges inside N is:

.

Analysis

Lets pick . Then .

Then not all the points of $P$ are separated by the ranges in $N$.

For a pair of such points $u$ and $v$, that means all the ranges that separate them are not in the net. So the set of ranges crossed by the segment $uv$ has no elements from the net. Therefore its weight is less than $\varepsilon W$, where $W$ is the total weight of the ranges.

Conclusion

Like previously:

Then taking the log, we get that the heaviest range is

Application

Discrepancy:

Plugging into the Theorem, we get:

Geometric Set Cover

For a range space $S = (X, \mathcal{R})$ with dual shattering dimension $\delta^*$, find the minimal number of ranges that cover all of $X$.

Definitions:

The number of ranges is m.

Optimal solution has k ranges

The simple greedy algorithm gives an $O(\log n)$ approximation:

Repeatedly pick the range that covers the most uncovered elements.
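For comparison with the reweighting approach, a minimal sketch of the greedy algorithm is shown here. Ranges are modeled as Python sets and the function name `greedy_set_cover` is an assumption.

```python
def greedy_set_cover(universe, ranges):
    """Return indices of ranges chosen greedily until every element is covered."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # The range covering the most still-uncovered elements.
        best = max(range(len(ranges)), key=lambda i: len(ranges[i] & uncovered))
        if not ranges[best] & uncovered:
            raise ValueError("the ranges do not cover the universe")
        chosen.append(best)
        uncovered -= ranges[best]
    return chosen

universe = range(6)
ranges = [{0, 1, 2, 3}, {0, 1, 4}, {2, 3, 5}, {4, 5}]
cover = greedy_set_cover(universe, ranges)
```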

Algorithm

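Assign weight 1 to each range.

Pick a weighted sample $R$ from $\mathcal{R}$ of size $O\left(\frac{\delta^*}{\varepsilon} \log \frac{1}{\varepsilon}\right)$, where $\varepsilon = \frac{1}{4k}$ and $k$ is the current guess for the size of the optimal solution.

For a point $p$ that is not covered by $R$, if the total weight of the ranges containing $p$ is at most $\varepsilon W$ (with $W$ the total weight of all the ranges), then double the weight of the ranges containing $p$.

Repeat until $R$ covers $X$, or $O(k \log (m/k))$ iterations have passed. Then double $k$ and repeat.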
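The loop above can be sketched in a few lines. Everything numeric here is an assumption made for illustration: $\varepsilon = 1/(4k)$, the sample-size constant `c`, the per-guess iteration budget, and the fallback of returning all ranges once the guess `k` reaches the number of ranges. The sketch also assumes that the ranges do cover the universe.

```python
import bisect
import itertools
import math
import random

def reweighting_set_cover(universe, ranges, rng, c=4):
    m = len(ranges)
    k = 1                                   # current guess for the optimal cover size
    while k < m:
        weight = [1] * m                    # exact integer weights, doubled over time
        size = math.ceil(c * k * math.log(4 * k + 1))           # assumed sample size
        budget = math.ceil(4 * k * math.log2(m / k + 1)) + 1    # assumed iteration budget
        for _ in range(budget):
            cumulative = list(itertools.accumulate(weight))
            total = cumulative[-1]
            # Weighted sample of range indices, drawn with replacement.
            sample = {bisect.bisect_right(cumulative, rng.randrange(total))
                      for _ in range(size)}
            uncovered = [p for p in universe
                         if not any(p in ranges[i] for i in sample)]
            if not uncovered:
                return sorted(sample)
            p = uncovered[0]
            hit = [i for i in range(m) if p in ranges[i]]
            # Double only if the ranges containing p are light: weight <= total / (4k).
            if 4 * k * sum(weight[i] for i in hit) <= total:
                for i in hit:
                    weight[i] *= 2
        k *= 2                              # the guess was too small, try a larger one
    return list(range(m))                   # fallback: every range

universe = range(6)
ranges = [{0, 1, 2, 3}, {0, 1, 4}, {2, 3, 5}, {4, 5}]
cover = reweighting_set_cover(universe, ranges, random.Random(0))
```

The lightness test is written with integers only, so the weights can grow arbitrarily large without overflow.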

Analysis

Consider the $i$th iteration in which ranges were doubled and where the sample was an $\varepsilon$-net.

When we double the ranges that contain $p$, one of them is in the optimal solution.

$t_j(i)$ is the number of times the $j$th range in the optimal solution was doubled in the first $i$ iterations.

Analysis-cont.

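The weight of the world is at least the sum of the weights of the optimal ranges.

Therefore: $\sum_{j=1}^{k} 2^{t_j(i)} \le W_i \le m (1 + \varepsilon)^i$, where $W_i$ is the total weight after $i$ doublings, $t_j(i)$ counts the doublings of the $j$th optimal range, and $\sum_j t_j(i) \ge i$.

Therefore, the weight of the optimal ranges goes up faster than the weight of the world, so at some point the algorithm will end.

That is at $i = O(k \log (m/k))$.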

Analysis-Running Time

Each iteration takes :

- Time to process point p.

Time to pick a sample and check it against all the points.

Total time:

Guarding an art gallery

Given a simple polygon P we need to place a minimal number of points so that all of P is visible from these points.

Visibility polygon:

Triangulation is the division of a polygon into triangles:

The number of triangles is $n - 2$, where $n$ is the number of vertices of the polygon.

They form a tree.

3-coloring:

Let's pick one of the triangles and color its vertices with the 3 colors. This forces the coloring of all the neighbors. If we continue, since this is a tree, 2 forcings won't meet and contradict each other.

Now each triangle has a vertex of each of the 3 colors.

Obviously, picking one color to be the guards covers all the triangles, so the number of guards is no more than $\lfloor n/3 \rfloor$.
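The forced coloring can be sketched as a breadth-first walk over neighboring triangles. The input format (a list of vertex-index triples from a triangulation of a simple polygon) and the function name `three_color` are assumptions. The example polygon is a pentagon triangulated as a fan.

```python
from collections import defaultdict, deque

def three_color(triangles):
    """3-color the vertices so that every triangle sees all three colors."""
    by_edge = defaultdict(list)              # edge -> indices of triangles using it
    for t, tri in enumerate(triangles):
        for i in range(3):
            by_edge[frozenset((tri[i], tri[(i + 1) % 3]))].append(t)
    color = dict(zip(triangles[0], (0, 1, 2)))   # color the first triangle freely
    seen, queue = {0}, deque([0])
    while queue:
        t = queue.popleft()
        tri = triangles[t]
        for i in range(3):
            for nb in by_edge[frozenset((tri[i], tri[(i + 1) % 3]))]:
                if nb in seen:
                    continue
                seen.add(nb)
                # The neighbor shares two colored vertices; its third color is forced.
                (new_vertex,) = set(triangles[nb]) - set(tri)
                used = {color[v] for v in triangles[nb] if v in color}
                color[new_vertex] = ({0, 1, 2} - used).pop()
                queue.append(nb)
    return color

triangles = [(0, 1, 2), (0, 2, 3), (0, 3, 4)]
color = three_color(triangles)
classes = [[v for v in color if color[v] == c] for c in range(3)]
guards = min(classes, key=len)               # the smallest color class guards everything
```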

Triangulation

Our solution.

is the range space that we want to solve the set-cover problem for.

To reduce to the previous set-cover, we need to prove that the dual shattering dimension is bounded. We do this by bounding the VC dimension.

In an actual solution, we need to discretize the range space, by providing the possible guard locations X:

, where is a set of representatives from the arrangement generated by the ranges .

X could be:

The vertices. This is .

The vertices and the vertices of the arrangement of the visibility polygons of the vertices. This is .

(Ben-Moshe et al. via James Alexander King) It is possible to construct a set X of size that will contain an optimal solution.

As long as it is polynomial, we are only proving the reduction.

Example

Gallery

Visibility polygons:

Example

Arrangement: Representatives:

Now

A range in S would be

or

An example of a solution would be .

Reducing to Set Cover

Now we need to prove that the VC dimension of is a constant.

Let's take $G$ to be a set of size $k$ that is shattered by the ranges. Then for every subset $G' \subseteq G$, there is a point $p$ so that $p$ sees every point of $G'$ and $p$ sees no point of $G \setminus G'$.

Define to be the number of subsets of R that can be seen from inside polygon .

Define to be the number of subsets of R that can be seen from inside polygon , where R is separated from by a line.

=4

Bounding

Let .

In , is separated from by a line . Let be the half plane on the side of .

Every point in R generates a wedge in

Each region in the arrangement of the wedges has a different set of points in R that it sees.

Bounding -cont.

Lets consider the radial order of the points in R from a point in : The order of the point is by .

The order is undefined only if two points in r have the same angle to . But

Therefore all the possible lines between points in (), and the lines in the wedges (), divide into regions. Each region sees its own subset of R, and has its own radial ordering for them.

Introducing back, the edges of limit the view from a point in to a wedge.

This introduces another possibilities.

In total, there are possibilities.

Triangulation.

From the triangulation of the polygon, let's take out a triangle so that none of the created parts has more than $2/3$ of the points from $G$.

The triangle that was taken out has complexity , since there are k points and no obstacles inside the triangle (so each visibility polygon sees a wedge inside the triangle.)

Bounding

Let be the 3 parts, and be the subsets of G inside those parts.

This gives .

But . (since G is shattered)

Therefore , that is a constant.

Set-Cover through Linear Programming

Linear Programming

For a set of variables:

Maximize (or minimize) a linear function.

While the variables are constrained by a series of linear inequalities.

Integer programming:

Same as linear, but the answer has to be in integers.

IP is NP-hard.

We assume we have an LP-solver.

Translation

In the language of Integer Programming, Set cover becomes:

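$P$ is the set of elements, $\mathcal{R}$ is the set of ranges.

$r$ is a range in $\mathcal{R}$, and $x_r \in \{0, 1\}$ is whether this range was chosen for the covering set.

Minimize $\sum_{r \in \mathcal{R}} x_r$ subject to $\sum_{r \ni p} x_r \ge 1$ for every $p \in P$, with $x_r \in \{0, 1\}$.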

Linear Programming

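Changing to linear programming, now $x_r$ could be anywhere in $[0, 1]$ instead of only $0$ or $1$.

Let $\alpha = \sum_{r \in \mathcal{R}} x_r$ be the value of the solution.

Now divide the whole problem by $\alpha$:

Final equation: $\sum_{r \in \mathcal{R}} y_r = 1$ and $\sum_{r \ni p} y_r \ge \varepsilon$ for every $p \in P$.

Or alternatively: maximize $\varepsilon$ subject to these constraints and $y_r \ge 0$.

Where $y_r = x_r / \alpha$ and $\varepsilon = 1 / \alpha$.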

Solution

After we solve this problem with the LP solver, we get a value for each range between 0 and 1.

The constraints give us that for every point, the weight of the ranges covering it is at least $\varepsilon$, while the total weight of the ranges is 1.

Thus for the dual range space, with the weights as per the LP solution, an $\varepsilon$-net is a subset of $\mathcal{R}$ such that every element of $P$ whose covering ranges have weight at least $\varepsilon$, that is all of them, is covered.

Analysis

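An $\varepsilon$-net is of the size $O\left(\frac{\delta^*}{\varepsilon} \log \frac{1}{\varepsilon}\right)$. Since the LP optimum is at most the size $k$ of the optimal integral cover, $1/\varepsilon \le k$ and this is $O(\delta^* k \log k)$.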

So the algorithm is:

Run the LP-solver on the equations.

Obtain the $\varepsilon$-net from the resulting dual range space.

END
