When Do Noisy Votes Reveal the Truth?

Ioannis Caragiannis (1), Ariel D. Procaccia (2), Nisarg Shah (2) (speaker)

(1) University of Patras & CTI
(2) Carnegie Mellon University

What? Why?
• What?
  Alternatives to be compared
  True order (unknown ground truth)
  Noisy estimates (votes) drawn from some distribution around it
  Q: How many votes are needed to accurately find the true order?
• Why?
  Practical motivation
  Theoretical motivation

[Figure: rankings over the alternatives a, b, c, d: a > b > c > d; b > a > c > d; a > c > b > d; a > b > d > c]

Practical Motivation
1. Human Computation
   EteRNA, Foldit, Crowdsourcing …
   How many users/workers are required?
2. Judgement Aggregation
   Jury system, experts ranking restaurants, …
   How many experts are required?

Theoretical Motivation
• Maximum Likelihood Estimator (MLE) view: is a given voting rule the MLE for any noise model?
• Problems
  Only 1 MLE per noise model
  Strange noise models
  Noise model is usually unknown
• Our Contribution
  MLE is too stringent!
  Just want low sample complexity
  Family of reasonable noise models

[Figure: voting rules and noise models]

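Boring Stuff!
• Voting rule ($r$)
  Input: several rankings of alternatives
  Social choice function (traditionally): output a winning alternative
  Social welfare function (this work): output a ranking of alternatives
• Noise model over rankings ($G$)
  Specifies $\Pr[\sigma \mid \sigma^*]$ for every ground truth $\sigma^*$ and every ranking $\sigma$
  Mallows' model: $\Pr[\sigma \mid \sigma^*] \propto \varphi^{d_{KT}(\sigma, \sigma^*)}$ with $\varphi \in (0, 1)$
  $d_{KT}$ = Kendall-Tau distance = number of pairwise comparisons on which two rankings disagree
• Sample complexity of rule $r$ for model $G$ and accuracy $\epsilon$
  Smallest $n$ such that, for every $\sigma^*$, $\Pr[r(n \text{ samples from } G(\sigma^*)) = \sigma^*] \ge 1 - \epsilon$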
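To make the two definitions above concrete, here is a minimal Python sketch of the Kendall-Tau distance and of the Mallows probabilities, computed by brute-force enumeration. The function names and the parameter name `phi` are illustrative assumptions, not notation from the talk, and the enumeration is only practical for a handful of alternatives.

```python
from itertools import combinations, permutations


def kendall_tau(sigma, tau):
    """Count the pairs of alternatives that sigma and tau order differently."""
    pos = {alt: i for i, alt in enumerate(tau)}
    # combinations(sigma, 2) yields (x, y) with x ranked above y in sigma.
    return sum(1 for x, y in combinations(sigma, 2) if pos[x] > pos[y])


def mallows_distribution(truth, phi):
    """Exact Mallows probabilities Pr[sigma | truth], by enumeration.

    Assumes 0 < phi < 1; each ranking gets weight phi ** distance,
    then the weights are normalized to sum to 1.
    """
    weights = {s: phi ** kendall_tau(s, truth) for s in permutations(truth)}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


truth = ("a", "b", "c", "d")
dist = mallows_distribution(truth, phi=0.5)
print(kendall_tau(truth, ("b", "a", "c", "d")))  # one swapped pair
print(max(dist, key=dist.get))                   # most likely ranking
```

Smaller values of `phi` concentrate the probability mass more tightly around the ground truth, which is why fewer votes are needed in that regime.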

Sample Complexity for Mallows' Model
• Kemeny rule (+ any tie-breaking) = MLE
• Theorem: Kemeny rule + uniformly random tie-breaking = optimal sample complexity for Mallows' model, for any accuracy.
  Subtlety: MLE does not always imply optimal sample complexity!
• So, are the other voting rules really bad for Mallows' model? No.
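As an illustration of the rule named in the theorem, the sketch below implements the Kemeny rule by brute force: it returns a ranking that minimizes the total Kendall-Tau distance to the votes, breaking ties uniformly at random. The function names and the three example votes are assumptions made for this sketch, and the exhaustive search is only feasible for very few alternatives.

```python
import random
from itertools import combinations, permutations


def kendall_tau(sigma, tau):
    """Count the pairs of alternatives that sigma and tau order differently."""
    pos = {alt: i for i, alt in enumerate(tau)}
    return sum(1 for x, y in combinations(sigma, 2) if pos[x] > pos[y])


def kemeny(votes, rng=random):
    """Brute-force Kemeny rule with uniformly random tie-breaking."""
    candidates = list(permutations(votes[0]))
    # Total disagreement of each candidate ranking with the whole profile.
    cost = {c: sum(kendall_tau(c, v) for v in votes) for c in candidates}
    best = min(cost.values())
    minimizers = [c for c in candidates if cost[c] == best]
    return rng.choice(minimizers)


# Three hypothetical noisy votes around the order a > b > c > d.
votes = [("b", "a", "c", "d"), ("a", "c", "b", "d"), ("a", "b", "d", "c")]
print(kemeny(votes))
```

Each vote here differs from a > b > c > d in exactly one pair, and every pairwise majority is strict, so the minimizer is unique and the tie-breaking step never triggers for this profile.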

PM-c and PD-c Rules
• Pairwise Majority Consistent Rules (PM-c)
  Must match the pairwise majority graph whenever it is acyclic
  Condorcet consistency for social welfare functions

[Figure: a pairwise majority graph on the alternatives a, b, c, d that corresponds to the ranking 𝑎≻𝑏≻𝑐≻𝑑]
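The PM-c condition can be sketched in code as follows: build the pairwise majority graph and, if it is a complete acyclic graph, read off the ranking it induces. The function name is hypothetical, and returning `None` when a pair is tied or the graph has a cycle is an assumption of this sketch (a PM-c rule is unconstrained in those cases).

```python
from itertools import combinations


def pairwise_majority_ranking(votes):
    """Return the ranking induced by the pairwise majority graph, or None.

    None is returned when some pair is tied or when the majority graph
    contains a cycle, i.e. when the PM-c condition imposes no constraint.
    """
    alternatives = votes[0]
    positions = [{alt: i for i, alt in enumerate(v)} for v in votes]
    wins = {alt: 0 for alt in alternatives}
    for x, y in combinations(alternatives, 2):
        x_over_y = sum(1 for p in positions if p[x] < p[y])
        if 2 * x_over_y > len(votes):
            wins[x] += 1
        elif 2 * x_over_y < len(votes):
            wins[y] += 1
        else:
            return None  # tied pair: the majority graph is incomplete
    ranking = tuple(sorted(alternatives, key=lambda alt: -wins[alt]))
    m = len(alternatives)
    # A complete majority graph is acyclic exactly when the numbers of
    # pairwise wins are m-1, m-2, ..., 0.
    if [wins[alt] for alt in ranking] != list(range(m - 1, -1, -1)):
        return None
    return ranking


votes = [("b", "a", "c", "d"), ("a", "c", "b", "d"), ("a", "b", "d", "c")]
print(pairwise_majority_ranking(votes))
cycle = [("a", "b", "c"), ("b", "c", "a"), ("c", "a", "b")]
print(pairwise_majority_ranking(cycle))
```

Any PM-c rule must output the returned ranking whenever it is not `None`; rules differ only in how they handle the remaining profiles.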

PM-c and PD-c Rules
• PD-c rules similar, but focus on positions of alternatives

[Figure: Venn diagram of PM-c and PD-c rules; labels: KM, SL, CP, RP, SC, BL, PSR]

The Big Picture
• Kemeny rule + uniform tie-breaking: optimal sample complexity
• Logarithmic: PM-c rules need O(log m) samples (m = #alternatives); any voting rule needs Ω(log m)
• Polynomial: many scoring rules
• Exponential: plurality and veto are strictly exponential
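One way to get a feel for these bounds is to estimate a rule's accuracy empirically. The sketch below draws votes from Mallows' model by enumeration and measures how often a rule recovers the ground truth. Everything here is an illustrative assumption rather than the talk's experiment: the function names, the choice of the Borda count as an example scoring rule, the random tie-breaking, and the default trial count.

```python
import random
from itertools import combinations, permutations


def kendall_tau(sigma, tau):
    """Count the pairs of alternatives that sigma and tau order differently."""
    pos = {alt: i for i, alt in enumerate(tau)}
    return sum(1 for x, y in combinations(sigma, 2) if pos[x] > pos[y])


def sample_mallows(truth, phi, n, rng):
    """Draw n votes from Mallows' model by enumerating all rankings."""
    rankings = list(permutations(truth))
    weights = [phi ** kendall_tau(r, truth) for r in rankings]
    return rng.choices(rankings, weights=weights, k=n)


def borda_ranking(votes, rng):
    """Rank alternatives by Borda score, breaking ties at random."""
    m = len(votes[0])
    score = {alt: 0 for alt in votes[0]}
    for vote in votes:
        for position, alt in enumerate(vote):
            score[alt] += m - 1 - position
    return tuple(sorted(score, key=lambda alt: (-score[alt], rng.random())))


def estimate_accuracy(rule, truth, phi, n, trials=200, seed=0):
    """Fraction of trials in which the rule returns the ground truth."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        votes = sample_mallows(truth, phi, n, rng)
        hits += rule(votes, rng) == truth
    return hits / trials


truth = ("c", "a", "d", "b")
for n in (1, 5, 25):
    print(n, estimate_accuracy(borda_ranking, truth, phi=0.7, n=n))
```

The sample complexity for accuracy ε would then be approximated by the smallest `n` at which the estimate reaches 1 − ε, although a faithful estimate needs the worst case over ground truths and many more trials.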


Take-Away - I

Given any fixed noise model, sample complexity is a clear and useful criterion for selecting voting rules

• Hey, what happened to the noise model being unknown?

Generalization
• Stronger need: the noise model is unknown, so a rule should work well on a family of reasonable noise models
• Problems
  1. What is reasonable?
  2. HUGE sample complexity for near-extreme parameter values!
• Relaxation: accuracy in the limit
  The rule returns the ground truth with probability 1 given infinitely many samples
• Novel axiomatic property
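Stated a little more formally, the relaxation might be written as below. The notation is an assumption introduced for this sketch: $r$ is the voting rule, $G$ the noise model, $\sigma^*$ the ground truth, and $\mathrm{Acc}^r(G, n)$ the worst-case probability that $r$ returns the ground truth from $n$ samples.

```latex
\[
\mathrm{Acc}^r(G, n) \;=\; \min_{\sigma^*} \;
  \Pr_{\pi \sim G(\sigma^*)^n}\bigl[\, r(\pi) = \sigma^* \,\bigr],
\qquad
r \text{ is accurate in the limit for } G
\;\iff\;
\lim_{n \to \infty} \mathrm{Acc}^r(G, n) = 1 .
\]
```

Unlike sample complexity, this property says nothing about how fast the accuracy approaches 1, which is what makes it usable across a whole family of noise models.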

Accuracy in the Limit

Voting rules: noise models for which they are accurate in the limit
• PM-c + PD-c: Mallows' model (probability decreases exponentially in the KT distance)
• PM-c + PD-c: all KT-monotonic noise models (probability decreases monotonically in the KT distance)
• PM-c: all d-monotonic noise models iff d is Majority Concentric (MC)
• PD-c: all d-monotonic noise models iff d is Position Concentric (PC)
• PM-c + PD-c: all d-monotonic noise models iff d is both MC and PC

Monotonicity is reasonable, but why Kendall-Tau distance?

Take-Away - II
• Robustness: accuracy in the limit over a family of reasonable noise models
• d-monotonic noise models are reasonable
• If you believe in PM-c and PD-c rules, look for distances that are both MC and PC: Kendall-Tau, footrule, maximum displacement
• Cayley distance and Hamming distance are neither MC nor PC
• Even the most popular rule, plurality, is not accurate in the limit for any monotonic noise model over either distance! It loses too much information for the true ranking to be recovered
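For reference, the distances named on this slide (other than Kendall-Tau) can be sketched as below, using their standard textbook definitions. The function names are illustrative assumptions; rankings are given as tuples of alternatives, best first.

```python
def footrule(sigma, tau):
    """Total displacement: sum over alternatives of |position difference|."""
    pos = {alt: i for i, alt in enumerate(tau)}
    return sum(abs(i - pos[alt]) for i, alt in enumerate(sigma))


def max_displacement(sigma, tau):
    """Largest displacement of any single alternative."""
    pos = {alt: i for i, alt in enumerate(tau)}
    return max(abs(i - pos[alt]) for i, alt in enumerate(sigma))


def hamming(sigma, tau):
    """Number of positions at which the two rankings differ."""
    return sum(1 for x, y in zip(sigma, tau) if x != y)


def cayley(sigma, tau):
    """Minimum number of swaps (of any two alternatives) turning sigma into tau.

    Equals the number of alternatives minus the number of cycles of the
    permutation that maps positions in sigma to positions in tau.
    """
    pos = {alt: i for i, alt in enumerate(tau)}
    perm = [pos[alt] for alt in sigma]
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = perm[i]
    return len(perm) - cycles


truth = ("a", "b", "c", "d")
reverse = ("d", "c", "b", "a")
for d in (footrule, max_displacement, hamming, cayley):
    print(d.__name__, d(truth, reverse))
```

Note how differently the distances treat the reversed ranking: it is as far as possible under footrule, maximum displacement and Hamming, but only two swaps away under Cayley.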

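Distances over Rankings
• MC (Majority-Concentric) Distance
  Ranking σ*, distance d
  For every pairwise comparison, a (weak) majority of the rankings in every ball around σ* (all rankings within a given distance of σ*) must agree with σ*

[Figure: concentric neighborhoods around σ*, checked on the comparisons 𝑎≻𝑏? and 𝑐≻𝑑?]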
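The MC condition can be checked by brute force for a small number of alternatives. The sketch below assumes the reading that, for every ranking σ* and every radius k, at least half of the rankings within distance k of σ* agree with σ* on each pair; the function names are hypothetical, and the check is exhaustive, so it is only practical for three or four alternatives.

```python
from itertools import combinations, permutations


def kendall_tau(sigma, tau):
    """Count the pairs of alternatives that sigma and tau order differently."""
    pos = {alt: i for i, alt in enumerate(tau)}
    return sum(1 for x, y in combinations(sigma, 2) if pos[x] > pos[y])


def is_majority_concentric(distance, alternatives):
    """Exhaustively test the MC condition for a distance over rankings."""
    rankings = list(permutations(alternatives))
    for truth in rankings:
        radii = sorted({distance(r, truth) for r in rankings})
        for k in radii:
            ball = [r for r in rankings if distance(r, truth) <= k]
            # combinations(truth, 2) yields (x, y) with x above y in truth.
            for x, y in combinations(truth, 2):
                agree = sum(1 for r in ball if r.index(x) < r.index(y))
                if 2 * agree < len(ball):
                    return False  # a strict majority in the ball disagrees
    return True


print(is_majority_concentric(kendall_tau, "abcd"))
```

Passing the check for one small number of alternatives does not prove that a distance is MC in general; a failure, on the other hand, is a genuine counterexample.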

Discussion
1. From the stringent MLE requirement to sample complexity
   Connections to axiomatic and distance rationalizability views?
2. Noise model unknown: d-monotonic noise models
   Some distances over rankings are better suited for voting than others (e.g., MC and PC distances)
   An extensive study of the applicability of various distance metrics in social choice
3. Practical applications
   Extension to voting with partial information: pairwise comparisons, partial orders, top-k lists
