An Impossibility Theorem for Clustering By Jon Kleinberg

Upload: betty-parks

Post on 28-Dec-2015

Page 1: An Impossibility Theorem for Clustering By Jon Kleinberg

An Impossibility Theorem for Clustering

By Jon Kleinberg

Page 2:

Definitions

Clustering function: operates on a set S of $n \ge 2$ points and the distances among them, and returns $f(S, d) = \Gamma$, where $\Gamma$ is a partition of S

Distance function: $d : S \times S \to \mathbb{R}$, where the distance is 0 only for d(i,i). Does not require the triangle inequality.

Page 3:

Many different clustering criteria

k-center, k-median, k-means, Inter-Intra, etc.

Page 4:

k-Center

Minimize the maximum distance from any point to its nearest center

Page 5:

k-median

Minimize the average distance from each point to its nearest center

k-means: minimize the average squared distance from each point to its nearest center
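The three center-based criteria differ only in how the point-to-nearest-center distances are aggregated. A minimal sketch, assuming 1-D points and absolute difference as the distance; the function names and the sample data are illustrative and not from the slides:

```python
def nearest_center_distances(points, centers):
    """Distance from each 1-D point to its closest center."""
    return [min(abs(p - c) for c in centers) for p in points]

def k_center_cost(points, centers):
    # k-center: the worst-served point determines the cost.
    return max(nearest_center_distances(points, centers))

def k_median_cost(points, centers):
    # k-median: average distance to the nearest center.
    dists = nearest_center_distances(points, centers)
    return sum(dists) / len(dists)

def k_means_cost(points, centers):
    # k-means: average squared distance to the nearest center.
    dists = nearest_center_distances(points, centers)
    return sum(x * x for x in dists) / len(dists)

# Hypothetical data: five points and two candidate centers.
points = [0, 1, 4, 9, 10]
centers = [1, 9]
center_cost = k_center_cost(points, centers)
median_cost = k_median_cost(points, centers)
means_cost = k_means_cost(points, centers)
```

Here the nearest-center distances are 1, 0, 3, 0, 1, so the same pair of centers is scored very differently by the three criteria.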

Page 6:

Inter-Intra

[Figure labels: T(C), D(C)]

Maximize D(C) – T(C)

Page 7:

Motivation

Each criterion optimizes different features

Is there one clustering criterion with phenomenal cosmic powers?

Page 8:

Method

Give three intuitive axioms that any criterion should satisfy

Surprise: Not possible to satisfy all three

Reminiscent of Arrow’s impossibility theorem, which shows that no voting rule for ranking alternatives can satisfy a small set of reasonable axioms at once

Page 9:

Axiom 1 – Scale-Invariance

For any distance function d and any β > 0, we have that f(S,d) = f(S,βd)

Page 10:

Axiom 2 – Richness

Range(f) is equal to the set of all partitions of S

i.e. every possible clustering can be generated given the right distances
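To get a feel for how demanding Richness is, it can help to list every partition of a small set. A short sketch using a standard recursive enumeration; the helper name `partitions` is an illustrative choice, not something from the slides:

```python
def partitions(items):
    """Yield every partition of `items` as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Option 1: `first` forms a block of its own.
        yield [[first]] + smaller
        # Option 2: `first` joins one of the existing blocks.
        for idx in range(len(smaller)):
            yield smaller[:idx] + [[first] + smaller[idx]] + smaller[idx + 1:]

all_partitions_of_3 = list(partitions([1, 2, 3]))
count_for_4 = sum(1 for _ in partitions([1, 2, 3, 4]))
```

A rich clustering function on three points would have to be able to output each of the 5 partitions in `all_partitions_of_3` for some choice of distances; on four points there are already 15.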

Page 11:

Axiom 3 – Consistency

Let d and d’ be two distance functions. If $f(d) = \Gamma$, and d’ is such that the distance between any two points in the same cluster of $\Gamma$ is no larger than in d and the distance between any two points in different clusters is no smaller than in d, then $f(d') = \Gamma$

[Figure: within a cluster, $d'(i,j) \le d(i,j)$; between clusters, $d'(i,j) \ge d(i,j)$]
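The hypothesis of Consistency is a pairwise condition and is easy to check mechanically. A minimal sketch, assuming distances are stored as square matrices (lists of lists) indexed by point number and a partition is a list of index lists; the function name and sample matrices are hypothetical:

```python
def shrinks_within_stretches_between(d, d_new, partition):
    """True if d_new never increases a within-cluster distance and
    never decreases a between-cluster distance, relative to d."""
    label = {}
    for cluster_id, block in enumerate(partition):
        for point in block:
            label[point] = cluster_id
    points = sorted(label)
    for i in points:
        for j in points:
            if i == j:
                continue
            same_cluster = label[i] == label[j]
            if same_cluster and d_new[i][j] > d[i][j]:
                return False
            if not same_cluster and d_new[i][j] < d[i][j]:
                return False
    return True

# Hypothetical 3-point example with clusters {0, 1} and {2}.
partition = [[0, 1], [2]]
d = [[0, 2, 5], [2, 0, 4], [5, 4, 0]]
d_ok = [[0, 1, 6], [1, 0, 4], [6, 4, 0]]    # 0-1 shrinks, 0-2 grows
d_bad = [[0, 3, 5], [3, 0, 4], [5, 4, 0]]   # 0-1 grows inside a cluster
```

Consistency then says: whenever this check passes for the partition that f returned on d, f must return the same partition on the new distances.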

Page 12:

Definition

Anti-chain: A collection of partitions is an anti-chain if it does not contain two distinct partitions such that one is a refinement of the other

An anti-chain cannot satisfy Richness: the set of all partitions contains, for example, the all-singletons partition, which is a refinement of every other partition
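The refinement relation and the anti-chain property can be tested directly on small collections. A sketch under the assumption that a partition is a list of blocks (lists of hashable points); both helper names are illustrative:

```python
def is_refinement(fine, coarse):
    """True if every block of `fine` lies inside some block of `coarse`."""
    coarse_blocks = [set(block) for block in coarse]
    return all(any(set(block) <= c for c in coarse_blocks) for block in fine)

def is_antichain(collection):
    """True if no partition in the collection refines a different one."""
    # Normalise so that block order and element order do not matter.
    keys = [frozenset(frozenset(block) for block in p) for p in collection]
    for a, p in enumerate(collection):
        for b, q in enumerate(collection):
            if keys[a] != keys[b] and is_refinement(p, q):
                return False
    return True

two_block = [[[1, 2], [3]], [[1, 3], [2]], [[1], [2, 3]]]
with_singletons = two_block + [[[1], [2], [3]]]
```

The three two-block partitions of {1, 2, 3} form an anti-chain, but adding the all-singletons partition breaks the property, which is why no anti-chain can contain every partition.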

Page 13:

Main Result

For each $n \ge 2$, there is no clustering function f that satisfies Scale-Invariance, Richness and Consistency

Implied by the proof that if f satisfies Scale-Invariance and Consistency, then Range(f) is an anti-chain

Page 14:

Reminder of Axioms

Scale-Invariance: For any distance function d and any β > 0, we have that f(d) = f(βd)

Richness: Range(f) is equal to the set of all partitions of S

Consistency: Let d and d’ be two distance functions. If $f(d) = \Gamma$, and d’ is such that the distance between any two points in the same cluster of $\Gamma$ is no larger than in d and the distance between any two points in different clusters is no smaller than in d, then $f(d') = \Gamma$

Page 15:

Single Linkage

Cluster by repeatedly linking the closest pair of points that lie in different clusters

[Figure: points on a number line at 0 1 4 9 10 12 15 19 20]
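The linking order can be traced by hand or with a few lines of code. The sketch below treats the numbers on this slide as positions of points on a line (an assumption about the lost figure) and records each merge that single linkage performs, using a small union-find structure; the function name is illustrative:

```python
from itertools import combinations

def single_linkage_merges(points):
    """Return (distance, point_a, point_b) for every merge, in order."""
    n = len(points)
    # All pairs, sorted by distance (ties broken by index).
    edges = sorted((abs(points[i] - points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    merges = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:               # link only across clusters
            parent[root_i] = root_j
            merges.append((dist, points[i], points[j]))
    return merges

points = [0, 1, 4, 9, 10, 12, 15, 19, 20]
merges = single_linkage_merges(points)
```

With these points the merges happen at distances 1, 1, 1, 2, 3, 3, 4 and finally 5, the last one joining {0, 1, 4} to everything else across the gap between 4 and 9.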

Page 16:

Any two axioms

For every pair of axioms, there is a stopping condition for single linkage that satisfies both

Consistency + Richness: only link if distance is less than r

Consistency + Scale-Invariance (SI): stop when you have k connected components

Richness + SI: if x is the diameter of the graph, only add edges with weight at most βx, for a fixed β < 1
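The three stopping conditions can be compared side by side. This is a rough sketch, not the slides' own code: it assumes 1-D points, returns clusters as lists of point indices, and uses hypothetical parameter names `k`, `r` and `beta` for the three conditions (exactly one should be supplied):

```python
from itertools import combinations

def single_linkage(points, k=None, r=None, beta=None):
    """Single linkage with one of three stopping conditions."""
    n = len(points)
    edges = sorted((abs(points[i] - points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    diameter = edges[-1][0]                # largest pairwise distance
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    n_components = n
    for dist, i, j in edges:
        if k is not None and n_components <= k:
            break                          # k connected components reached
        if r is not None and dist >= r:
            break                          # only link if distance < r
        if beta is not None and dist > beta * diameter:
            break                          # only edges of weight <= beta * x
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            n_components -= 1

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

points = [0, 1, 4, 9, 10, 12, 15, 19, 20]
scaled = [10 * p for p in points]          # same data, distances times 10
by_k = single_linkage(points, k=3)
by_r = single_linkage(points, r=3)
by_beta = single_linkage(points, beta=0.125)
```

Rescaling the data leaves the `k` and `beta` results unchanged, while the fixed threshold `r=3` breaks the scaled data into singletons, which illustrates why the distance-r condition gives up Scale-Invariance.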

Page 17:

Centroid-Based Clustering

(k,g)-centroid clustering function: Choose T, a set of k centroid points, such that $\sum_{i \in S} g(d(i,T))$ is minimized, where $d(i,T) = \min_{j \in T} d(i,j)$

If g is the identity, we get k-median, etc.

Result: For every $k \ge 2$, every function g, and n sufficiently large relative to k, the (k,g)-centroid clustering function does not satisfy Consistency.

Page 18:

Proof: A contradiction

[Figure: two clusters, X (size m) and Y (size λm); distances are r within X, ε within Y, and r+δ between X and Y]

$\sum_{i \in S} g(d(i,T)) \approx m\,g(r) + \lambda m\,g(\varepsilon)$

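Page 19:

A new distance function

[Figure: X is split into X0 (size m/2) and X1 (size m/2); distances are r’ within X0 and within X1, r between X0 and X1, ε within Y (size λm), and r+δ between X and Y, with r’ < r]

$\sum_{i \in S} g(d(i,T)) \approx m\,g(r') + \lambda m\,g(r+\delta)$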

Page 20:

Wrapping Up

If we pick λ, r, r’, ε and δ right then we can have:

$m\,g(r) + \lambda m\,g(\varepsilon) > m\,g(r') + \lambda m\,g(r+\delta)$

But then our new centers are in X0 and X1

But the new distance function satisfies the hypothesis of Consistency, so f should still give us X and Y.

This covers the case where k is 2.
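A small numeric instance makes the contradiction concrete. The sketch below is an illustration in the spirit of the argument, not the original proof: it assumes k = 2, g the identity (k-median), centers restricted to points of S, m = 8, λ = 0.25, r = 1, δ = 0.1, ε = 0.5 and r’ = 0.1, and it finds the best centers by brute force. All names are hypothetical.

```python
from itertools import combinations

def centroid_clustering(d, k, g=lambda x: x):
    """Brute-force (k,g)-centroid clustering on a distance matrix d."""
    n = len(d)
    best_cost, best_T = None, None
    for T in combinations(range(n), k):
        cost = sum(g(min(d[i][c] for c in T)) for i in range(n))
        if best_cost is None or cost < best_cost:
            best_cost, best_T = cost, T
    clusters = {c: [] for c in best_T}
    for i in range(n):
        # Ties go to the lowest-indexed center.
        nearest = min(best_T, key=lambda c: d[i][c])
        clusters[nearest].append(i)
    return sorted(clusters.values())

def make_distance(r, r_within_halves, delta, eps):
    """Points 0-3 are X0, 4-7 are X1, 8-9 are Y."""
    group = ["X0"] * 4 + ["X1"] * 4 + ["Y"] * 2
    n = len(group)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            a, b = group[i], group[j]
            if a == "Y" and b == "Y":
                d[i][j] = eps                  # within Y
            elif a == "Y" or b == "Y":
                d[i][j] = r + delta            # between X and Y
            elif a == b:
                d[i][j] = r_within_halves      # within X0 or within X1
            else:
                d[i][j] = r                    # between X0 and X1
    return d

d_old = make_distance(r=1.0, r_within_halves=1.0, delta=0.1, eps=0.5)
d_new = make_distance(r=1.0, r_within_halves=0.1, delta=0.1, eps=0.5)
before = centroid_clustering(d_old, k=2)
after = centroid_clustering(d_new, k=2)
```

Under `d_old` the result is X = {0..7} and Y = {8, 9}. `d_new` only shrinks distances inside X, so Consistency would demand the same answer, yet the optimal centers move into X0 and X1 and the output changes. The two Y points are equidistant from both new centers, so which half they join depends on the tie-breaking rule assumed here.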

Page 21:

Discussion: Relaxing Axioms

Refinement-consistency: if d’ is an f(d)-transformation of d (within-cluster distances no larger, between-cluster distances no smaller), then f(d’) is a refinement of f(d)

Near-Richness: all partitions except the trivial one can be obtained

With these two relaxations in place of Consistency and Richness, there is a clustering function that satisfies all three properties.

What other relaxations could we have?

Page 22:

Discussion

Does this mean there is a law of continuous employment for clustering criterion creators?

Is the clustering function properly defined? Should it allow overlaps? Should it allow outliers?

Are these the right axioms? All partitions possible vs. power set

Axioms for graph clustering?

Page 23:

Questions?