Sorting and Selection with Imprecise Comparisons Miklós Ajtai, Vitaly Feldman, Avinatan Hassidim, Jelani Nelson Presented by Dan Garber

Post on 19-Dec-2015


Sorting and Selection with Imprecise Comparisons

Miklós Ajtai, Vitaly Feldman, Avinatan Hassidim, Jelani Nelson

Presented by Dan Garber

Sorting and max-finding algorithms are based on performing comparisons between pairs of elements.

Given two elements to compare, we assume that we can tell which is of greater value.

In some scenarios it is not always possible to assume the above because we don’t always know the “real” values of the elements.

Introduction

Which car do we prefer to buy?

Introduction

How to decide which sports team is champion?
◦ Is team A better than team B?
◦ Is team D better than team B?

[Figure: knockout bracket. A beats B, D beats C, and D beats A in the final.]

Introduction

We are given a group of n elements; each of them is associated with an unknown “true” numerical value.

Given two elements x_i, x_j:

compare(x_i, x_j) =
  x_i          if val(x_i) > val(x_j) + 1
  x_j          if val(x_j) > val(x_i) + 1
  x_i or x_j   else

Corollary: it is not possible to find the exact maximum or the correct permutation.

Model
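The comparison model can be sketched in a few lines of Python. This is only an illustration under assumptions: the threshold of 1 is the one used in the proofs of Theorems 1 and 2, the helper name `make_compare` is hypothetical, and the "either answer" case is simulated with a seeded random choice, whereas the model allows any answer there, including an adversarial one.

```python
import random

def make_compare(val, rng=None):
    """Build an imprecise comparator over element names.

    `val` maps each element to its hidden true value (hypothetical helper).
    """
    rng = rng or random.Random(0)

    def compare(a, b):
        if val[a] > val[b] + 1:
            return a               # a is clearly larger: the answer is reliable
        if val[b] > val[a] + 1:
            return b               # b is clearly larger: the answer is reliable
        return rng.choice((a, b))  # values within 1: either answer may come back

    return compare

val = {"a": 0.0, "b": 0.5, "c": 3.0}
compare = make_compare(val)
clear_result = compare("a", "c")   # values differ by more than 1
close_result = compare("a", "b")   # values differ by at most 1
```

Only the first comparison has a guaranteed outcome; the second may return either element.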

The error of a max-finding algorithm A which outputs x is k if:

val(x*) - val(x) ≤ k, where x* is an element of maximum value.

The error of a sorting algorithm A which outputs a permutation π is k if:

for all i < j: val(x_π(i)) - val(x_π(j)) ≤ k

Model
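The two error measures can be computed directly when the true values are known, which is useful for checking an algorithm in simulation. The sketch below assumes the output order is meant to be increasing; the function names `max_error` and `sort_error` are made up for this example.

```python
def max_error(val, x):
    """Smallest k such that val(x*) - val(x) <= k."""
    return max(val.values()) - val[x]

def sort_error(val, perm):
    """Smallest k >= 0 with val(perm[i]) - val(perm[j]) <= k for all i < j."""
    worst = 0
    largest_so_far = None
    for x in perm:
        if largest_so_far is not None:
            # The worst pair ending at x starts at the largest earlier value.
            worst = max(worst, largest_so_far - val[x])
        if largest_so_far is None or val[x] > largest_so_far:
            largest_so_far = val[x]
    return worst

val = {"a": 0, "b": 1, "c": 2}
```

A correctly sorted output has error 0, and a fully reversed output of these three elements has error 2.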

Explore the tradeoff between the error of an algorithm and the number of comparisons.

An algorithm that performs all possible comparisons “knows everything” and can minimize the error.

Can we find algorithms that achieve the same error bound with fewer comparisons?

Goal

Experimental Psychology & Sociology
◦ Ranking of elements by human subjects.

Marketing Research

Information Retrieval
◦ Training algorithms using human evaluators.

Designing Sports tournaments
◦ Minimize error while reducing the number of games required.

Motivation

1 2 3 4 5 6 7 8 9 10

Max finding:

max = array[0];
for (i = 1; i < 10; i++)        // one comparison per remaining element
    if (array[i] > max)
        max = array[i];

Sorting (bubble sort):

do {
    swapped = false;
    for (i = 0; i < 9; i++)     // compare each adjacent pair
        if (array[i] > array[i+1]) {
            swap(array[i], array[i+1]);
            swapped = true;
        }
} while (swapped);              // stop after a pass with no swaps

Examples

1. Lower error bounds
2. Maximum finding
   a. Error 2 algorithm
   b. Error k algorithm
3. Sorting
   a. Sorting with error 2
   b. Selection with error k
   c. Sorting with error k
4. Lower bounds

Agenda

Theorem 1. Sorting according to the number of wins in a round-robin (RR) tournament yields error 2.

Proof.
◦ Let x, y be such that val(y)+2 < val(x).
◦ For any z: y defeats z ⇒ x defeats z.
◦ x defeats y.
◦ Hence x has strictly more wins than y.

Lower error bounds
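Theorem 1 is easy to check empirically. The sketch below sorts by number of round-robin wins and measures the resulting error on random data; the value range, the seed and the random behaviour of the comparator on close pairs are assumptions made for the simulation only.

```python
import random
from itertools import combinations

def round_robin_sort(items, compare):
    """Sort in increasing order of wins in a round-robin tournament."""
    wins = {x: 0 for x in items}
    for a, b in combinations(items, 2):
        wins[compare(a, b)] += 1
    return sorted(items, key=lambda x: wins[x])

rng = random.Random(1)
val = {i: rng.uniform(0, 6) for i in range(40)}   # hidden true values

def compare(a, b):
    if abs(val[a] - val[b]) > 1:
        return a if val[a] > val[b] else b        # reliable answer
    return rng.choice((a, b))                     # close values: arbitrary answer

order = round_robin_sort(list(val), compare)
# Largest amount by which an earlier element exceeds a later one.
worst = max(val[order[i]] - val[order[j]]
            for i in range(len(order)) for j in range(i + 1, len(order)))
```

By the theorem, `worst` can never exceed 2, whatever the comparator answers on close pairs.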

Theorem 2. No deterministic max-finding algorithm has error less than 2.

Proof.
◦ Assume three elements: a, b, c.
◦ The comparator can claim: a>b>c>a.
◦ W.l.o.g. assume the algorithm outputs a as max.
◦ The values of a, b, c could be 0, 1, 2, so the output is 2 below the maximum.

Lower error bounds
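The adversary argument of Theorem 2 can be verified by brute force. The sketch below, which assumes the threshold-1 comparison model and uses hypothetical helper names, enumerates every assignment of the values 0, 1, 2 that is consistent with the cyclic claims and records the worst error for each possible output.

```python
from itertools import permutations

# The comparator's claims a > b > c > a, stored as (winner, loser) pairs.
claims = [("a", "b"), ("b", "c"), ("c", "a")]

def consistent(val):
    # A claim is impossible only if the loser exceeds the winner by more than 1.
    return all(val[loser] <= val[winner] + 1 for winner, loser in claims)

def worst_case_error(output):
    errors = []
    for values in permutations((0, 1, 2)):
        val = dict(zip("abc", values))
        if consistent(val):
            errors.append(max(values) - val[output])
    return max(errors)

worst = {x: worst_case_error(x) for x in "abc"}
```

Whichever element is output, some consistent assignment gives it the value 0, so the worst-case error is 2 in all three cases.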

Algorithm A₂(s):
1. Label all elements as candidates.
2. While there are more than s candidate elements:
   a) Pick an arbitrary set of s candidate elements and play them in a RR tournament. Let x have the most wins.
   b) Compare x against all candidate elements and eliminate all elements that lose to x.
3. Play the final (at most s) candidate elements in a RR tournament and return the element with the most wins.

Max finding with error 2
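The steps of A₂(s) translate almost line by line into Python. This is a minimal sketch, not the authors' code: it assumes distinct element identifiers, s ≥ 2, and a comparator that gives the same answer when a pair is asked twice (simulated here with a cache), so that the elements x beat in the tournament are also eliminated in Step 2(b).

```python
import math
import random
from itertools import combinations

def rr_winner(items, compare):
    """Return the element with the most wins in a round-robin tournament."""
    wins = {x: 0 for x in items}
    for a, b in combinations(items, 2):
        wins[compare(a, b)] += 1
    return max(items, key=lambda x: wins[x])

def a2(items, compare, s):
    candidates = list(items)                      # Step 1
    while len(candidates) > s:                    # Step 2
        x = rr_winner(candidates[:s], compare)    # Step 2(a)
        candidates = [y for y in candidates       # Step 2(b): drop losers to x
                      if y == x or compare(x, y) == y]
    return rr_winner(candidates, compare)         # Step 3

rng = random.Random(2)
val = {i: rng.uniform(0, 8) for i in range(100)}  # hidden true values
answers = {}

def compare(a, b):
    key = (min(a, b), max(a, b))
    if key not in answers:                        # cache: repeated queries agree
        if abs(val[a] - val[b]) > 1:
            answers[key] = a if val[a] > val[b] else b
        else:
            answers[key] = rng.choice(key)
    return answers[key]

best = a2(list(val), compare, s=math.isqrt(len(val)))
error = max(val.values()) - val[best]
```

The returned element is within 2 of the true maximum, as Lemma 1 states.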

Lemma 1. A₂ has error 2 and makes at most ns+(n^2)/s comparisons. With s=sqrt(n) we get at most 2n^(3/2) comparisons.

Proof.
◦ If x* is never eliminated ⇒ x* participates in Step 3. Theorem 1 ensures the error.
◦ If x* was eliminated, it was by an element x s.t. x*-x≤1 ⇒ any element with value less than x*-2 was also eliminated in this iteration.
◦ Comparisons bound: in each iteration at least (s-1)/2 elements are eliminated.

Max finding with error 2
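The choice s = sqrt(n) in Lemma 1 comes from balancing the two terms of the bound. A short derivation, treating s as a real number (an assumption made for illustration):

```latex
f(s) = ns + \frac{n^2}{s}, \qquad
f'(s) = n - \frac{n^2}{s^2} = 0 \iff s = \sqrt{n}, \qquad
f(\sqrt{n}) = n^{3/2} + n^{3/2} = 2n^{3/2}.
```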

A k-max-set is a set of elements that contains an element x such that x*-x≤k.

Lemma 2. The following algorithm performs a RR tournament and outputs a 1-max-set of size at most log(n).

After performing RR, the algorithm greedily picks an element which defeats as many thus-far undefeated elements as possible.

Max finding with error k
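The greedy step of Lemma 2 can be sketched as follows. The function name `one_max_set` is hypothetical, and one detail is an assumption of this sketch: a picked element counts as covering itself, which guarantees that every pick makes progress.

```python
import math
import random
from itertools import combinations

def one_max_set(items, compare):
    # Round-robin tournament: record whom each element defeats.
    beats = {x: set() for x in items}
    for a, b in combinations(items, 2):
        winner = compare(a, b)
        beats[winner].add(b if winner == a else a)

    undefeated = set(items)
    chosen = []
    while undefeated:
        # Greedy pick: cover as many still-undefeated elements as possible
        # (the pick itself counts as covered).
        x = max(items, key=lambda y: len((beats[y] | {y}) & undefeated))
        chosen.append(x)
        undefeated -= beats[x] | {x}
    return chosen

rng = random.Random(3)
val = {i: rng.uniform(0, 5) for i in range(50)}   # hidden true values

def compare(a, b):
    if abs(val[a] - val[b]) > 1:
        return a if val[a] > val[b] else b
    return rng.choice((a, b))

chosen = one_max_set(list(val), compare)
gap = max(val.values()) - max(val[x] for x in chosen)
```

The maximum element is either picked or defeated by a picked element, so `gap` is at most 1, and each pick roughly halves the undefeated set, which gives the logarithmic size.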

Algorithm 1-Cover:
◦ Run A₂(s) with s=sqrt(n)/8.
◦ Return the union of the x that were chosen in any iteration of Step 2(a), in addition to the output of Lemma 2 on the elements in the final tournament in Step 3.

Lemma 3. 1-Cover finds a 1-max-set of size at most sqrt(n)/4 using O(n^(3/2)) comparisons.

Max finding with error k

Algorithm A_k - Returns a k-max for k ≥ 3:
1. Return A₂(A'_(k-1)(x_1, x_2, ..., x_n)).

Algorithm A'_k - Returns a (k-1)-max-set of size O(n^(2^k/(3·2^k-4))) for k ≥ 2:
1. If k=2, return 1-COVER(x_1, x_2, ..., x_n).
2. Else:
   1. Equipartition the n elements into t(n,k) sets.
   2. Recursively call A'_(k-1) on each set to recover a (k-2)-max-set T_i.
   3. Return the output of 1-COVER with the union of T_1, ..., T_t as input.

[The formula defining t(n,k) is not recoverable from the transcript.]

Max finding with error k

[Figure: recursion tree of A₅ on n elements. At level k=2, A'₂ runs on each part S₁, S₂, ... and returns a 1-max-set; at k=3, A'₃ returns a 2-max-set; at k=4, A'₄ returns a 3-max-set; at k=5, A₅ returns a 5-max element.]

Max finding with error k

Theorem 3. For every 3 ≤ k ≤ loglogn, the algorithm A_k finds a k-max element using O(n^(1+1/((3/4)·2^k-1))) comparisons.

Corollary. There exists a max-finding algorithm using O(n) comparisons with error loglogn.

Max finding with error k
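The corollary follows by substituting k = loglogn into the bound of Theorem 3. The derivation below assumes that bound reads O(n^(1+1/((3/4)·2^k-1))), takes logarithms base 2, and assumes log n ≥ 4 so that the denominator is at least (1/2) log n:

```latex
2^{k} = \log n
\;\Longrightarrow\;
n^{\frac{1}{\frac{3}{4}\log n - 1}}
= 2^{\frac{\log n}{\frac{3}{4}\log n - 1}}
\le 2^{\frac{\log n}{\frac{1}{2}\log n}} = 4,
\qquad\text{so}\qquad
n^{1 + \frac{1}{\frac{3}{4}2^{k} - 1}} \le 4n = O(n).
```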

Lemma 5. In a RR tournament on n elements, the element with the median number of wins has at least (n-2)/4 wins and at least (n-2)/4 losses.

Sorting with error 2

Algorithm B₂:
1. Modify A₂ so that the x found in Step 2(a) is a pivot in the sense of Lemma 5.
2. Compare x against all elements and pivot into two sets.
3. Recursively sort each of the two sets.

Lemma 6. Algorithm B₂ sorts with error 2 and requires at most O(n^(3/2)) comparisons.

◦ Error bound - trivial

Sorting with error 2
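B₂ behaves like quicksort with a pivot chosen by a small tournament. The sketch below is one possible reading of the three steps, with assumptions stated plainly: the sample is the first s elements of the current set, s is fixed at about sqrt(n) at the top level, sets of at most s elements are sorted by round-robin wins, and elements are distinct identifiers.

```python
import math
import random
from itertools import combinations

def rr_sort(items, compare):
    """Sort in increasing order of round-robin wins (error 2 by Theorem 1)."""
    wins = {x: 0 for x in items}
    for a, b in combinations(items, 2):
        wins[compare(a, b)] += 1
    return sorted(items, key=lambda x: wins[x])

def b2_sort(items, compare, s=None):
    items = list(items)
    if s is None:
        s = max(3, math.isqrt(len(items)))    # sample size fixed at the top level
    if len(items) <= s:
        return rr_sort(items, compare)        # base step
    sample = rr_sort(items[:s], compare)
    pivot = sample[len(sample) // 2]          # median number of wins (Lemma 5)
    lower, upper = [], []
    for y in items:
        if y != pivot:
            (upper if compare(pivot, y) == y else lower).append(y)
    return b2_sort(lower, compare, s) + [pivot] + b2_sort(upper, compare, s)

rng = random.Random(4)
val = {i: rng.uniform(0, 10) for i in range(100)}   # hidden true values

def compare(a, b):
    if abs(val[a] - val[b]) > 1:
        return a if val[a] > val[b] else b
    return rng.choice((a, b))

order = b2_sort(list(val), compare)
worst = max(val[order[i]] - val[order[j]]
            for i in range(len(order)) for j in range(i + 1, len(order)))
```

Every element placed before the pivot is at most 1 above it and every element placed after is at most 1 below it, which is why the error bound of 2 is immediate.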

Analysis:
◦ Every recursive call contains at least (sqrt(n)-2)/4 elements ⇒ at most 4n/(sqrt(n)-2) iterations.
◦ In each iteration at most n/2 comparisons to find a median.
◦ Pivoting in each iteration: at most n comparisons.
◦ Sorting the base step: at most n/2 comparisons.
◦ Total: at most (4n/(sqrt(n)-2))·(n/2 + n + n/2) = 8n^2/(sqrt(n)-2) = O(n^(3/2)) comparisons.

Sorting with error 2

Definition 1. Element x_j in a set of n elements is of k-order i if there exists a partition S₁, S₂ of [n]\{j} such that |S₁| = i-1 and, for every l in S₁ and h in S₂:

val(x_h) ≥ val(x_j) - k and val(x_h) ≥ val(x_l) - k

A k-median is an element of k-order floor(n/2).

Selection with error k

Lemma 7. There exists a deterministic algorithm such that for any i in [n] and 2 ≤ k ≤ loglogn, the algorithm finds an element of k-order i in O(n^(1+1/(3·2^(floor(k/2)-1)-1))) comparisons.

Selection with error k

Algorithm C_k - Returns an element of k-order i:
1. If k≤3, sort using B₂ and return the element with index i.
2. Equipartition the elements into sets S_1,...,S_t.
3. Recursively call C_(k-2) on each set to get a (k-2)-median y_i.
4. Play the y_1,...,y_t in a RR tournament and let y be the element with the median number of wins.
5. Partition the elements according to y into X₁, X₂:
   1. If |X₁| = i-1 return y.
   2. Else if i≤|X₁| recursively find a k-order i in X₁.
   3. Else recursively find a k-order (i-|X₁|-1) in X₂.

Selection with error k

Theorem 5. For any 2 ≤ k ≤ 2loglogn, there exists a deterministic sorting algorithm with error k using O(n^(1+1/(3·2^(floor(k/2)-1)-1)) + nk·log(n)) comparisons.

Algorithm:
◦ Find an element x that is a k-median.
◦ Equipartition the elements into sets S₁, S₂ such that every element in S₂ is k-greater than every element in S₁∪{x}.
◦ Recursively sort each partition.

Sorting with error k

Theorem 6. Every deterministic max-finding algorithm with error k requires Ω(n^(1+1/(2^k-1))) comparisons.
◦ We saw an algorithm with O(n^(1+1/((3/4)·2^k-1))).

Theorem 7. Every deterministic algorithm which k-sorts n elements requires Ω(n^(1+1/2^(k-1))) comparisons.
◦ We saw an algorithm with O(n^(1+1/(3·2^(floor(k/2)-1)-1)) + nk·log(n)).

Lower Bounds
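To see how far apart the upper and lower bounds are, the exponents of n can be tabulated for small k. The formulas below are an assumption: they are the exponents as read from these slides (ignoring the additive nk·log(n) term for sorting), and the function names are invented for the example. Exact fractions avoid rounding noise.

```python
from fractions import Fraction

def max_upper_exponent(k):
    """Exponent in the max-finding upper bound, 1 + 1/((3/4)*2^k - 1)."""
    return 1 + Fraction(1) / (Fraction(3, 4) * 2**k - 1)

def max_lower_exponent(k):
    """Exponent in the max-finding lower bound, 1 + 1/(2^k - 1)."""
    return 1 + Fraction(1, 2**k - 1)

def sort_upper_exponent(k):
    """Exponent in the sorting upper bound, 1 + 1/(3*2^(floor(k/2)-1) - 1)."""
    return 1 + Fraction(1, 3 * 2**(k // 2 - 1) - 1)

def sort_lower_exponent(k):
    """Exponent in the sorting lower bound, 1 + 1/2^(k-1)."""
    return 1 + Fraction(1, 2**(k - 1))

# One row per k: (k, max lower, max upper, sort lower, sort upper).
table = [(k, max_lower_exponent(k), max_upper_exponent(k),
          sort_lower_exponent(k), sort_upper_exponent(k)) for k in range(2, 7)]
```

For k=2 the sorting exponents coincide at 3/2, while for max-finding the lower bound exponent is 4/3 against an upper bound exponent of 3/2.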