Class 8 - web.stanford.edu/class/cs265/Lectures/Lecture8/InClass8.pdf

Class 8 Locality Sensitive Hashing

Post on 23-Jan-2021

TRANSCRIPT

Page 1: Class 8 - web.stanford.edu/class/cs265/Lectures/Lecture8/InClass8.pdf

Class 8: Locality Sensitive Hashing

Page 2

𝑐-nearest neighbors

$S = \{x_1, x_2, \ldots, x_n\} \subseteq \mathbb{R}^d$. Given $y$, find $x_j$ so that

$\|y - x_j\|_2 \le c \cdot \min_i \|y - x_i\|_2$

• Space = $(d \cdot n)^{O(1)}$

• Query time = $o(n)$

(For today all of our points live on the unit sphere.)
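To make the guarantee concrete, here is a small Python sketch (not from the slides) of the brute-force baseline and of a check that a returned index satisfies the 𝑐-approximation. The names `nearest_index` and `is_c_approximate` are illustrative assumptions, and the three example points are made up.

```python
import math

def nearest_index(S, y):
    """Brute-force nearest neighbor: scans all n points, so a query costs O(n * d)."""
    return min(range(len(S)), key=lambda i: math.dist(S[i], y))

def is_c_approximate(S, y, j, c):
    """True if ||y - x_j||_2 <= c * min_i ||y - x_i||_2."""
    best = min(math.dist(x, y) for x in S)
    return math.dist(S[j], y) <= c * best

# Hypothetical data: three points on the unit circle and a query near the first.
S = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
y = (math.cos(0.3), math.sin(0.3))

print(nearest_index(S, y))               # 0
print(is_c_approximate(S, y, 1, c=2.0))  # the second point is too far for c = 2
```

The linear scan is exactly what the $o(n)$ query-time target is meant to beat.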

Page 3

𝑐-near-neighbors

[Figure: the query $y$ with two data points, a ball of radius $r = \min_i \|x_i - y\|_2$ and a larger ball of radius $cr$ around it; a point inside the radius-$cr$ ball is marked "Okay to return this."]

Imagine that this slide is the surface of the unit sphere….

Page 4

Today: (𝑟, 𝑐)-near-neighbors

$S = \{x_1, x_2, \ldots, x_n\} \subseteq \mathbb{R}^d$. Given $y$ so that $\min_i \|y - x_i\|_2 \le r$, find $x_j$ so that $\|y - x_j\|_2 \le c \cdot r$.

Before: given $y$, find $x_j$ so that $\|y - x_j\|_2 \le c \cdot \min_i \|y - x_i\|_2$.

• Space = $(d \cdot n)^{O(1)}$

• Query time = $o(n)$

(For today all of our points live on the unit sphere.)

Page 5

(𝑟, 𝑐)-near-neighbors

[Figure: the query $y$ with balls of radius $r$ and $cr$ around it and two data points; a point inside the radius-$cr$ ball is marked "Okay to return this."]

Imagine that this slide is the surface of the unit sphere….

Page 6

Fact

• If you can solve (𝑟, 𝑐)-nearest neighbors then you can (basically) solve 𝑐-nearest neighbors.

• (See lecture notes).

Page 7

Goal for today

• A solution to (𝑟, 𝑐)-approximate nearest neighbors.

• Tool: Locality-Sensitive Hashing.
  • Points that are near to each other have a good probability of colliding.
  • Points that are far from each other are unlikely to collide.

• Strategy:
  • Data structure: hash all the $x_i$.
  • To query, hash 𝑦. Return anything in 𝑦’s bucket.

Our strategy will actually be slightly more complicated than this, but this is the basic idea…

Page 8

Our Locality Sensitive Hashing Scheme

• Let $A \in \mathbb{R}^{k \times d}$ have i.i.d. $N(0,1)$ entries.
  • Here, $k = \frac{\pi \log n}{2r}$ (we’ll see why later).

• Define $h(x) = \mathrm{sign}(Ax)$.

Example: $Ax = (0.2, -1.3, 0.7, 3.2, -5, 0.01)^T$ gives $h(x) = (+1, -1, +1, +1, -1, +1)^T$.
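The hash $h(x) = \mathrm{sign}(Ax)$ can be sketched in a few lines of standard-library Python. This is an illustration rather than the course's code: `make_hash` and `sign_vector` are hypothetical names, and mapping a zero coordinate to $+1$ is an arbitrary convention (that event has probability zero for Gaussian rows).

```python
import random

def sign_vector(v):
    """Coordinate-wise sign; 0 is mapped to +1 by convention (an assumption)."""
    return tuple(1 if t >= 0 else -1 for t in v)

def make_hash(k, d, rng):
    """Draw a k x d matrix A with i.i.d. N(0,1) entries and return h(x) = sign(Ax)."""
    A = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]

    def h(x):
        # One inner product <a_j, x> per row a_j of A, then take signs.
        return sign_vector(sum(a * xi for a, xi in zip(row, x)) for row in A)

    return h

# The vector Ax shown on the slide and its sign pattern.
print(sign_vector([0.2, -1.3, 0.7, 3.2, -5, 0.01]))  # (1, -1, 1, 1, -1, 1)

h = make_hash(k=6, d=3, rng=random.Random(0))
x = [1.0, 0.0, 0.0]
print(h(x) == h([2.0, 0.0, 0.0]))  # True: positive scaling never changes a sign
```

The last line shows that the hash depends only on the direction of $x$, which is why restricting to the unit sphere loses nothing.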

Page 9

Actually choose 𝑠 independent copies of this

• Choose $s = \sqrt{n}$

• Choose $k = \frac{\pi \log n}{2r}$

• For $i = 1, \ldots, s$:
  • Let $A_i \in \mathbb{R}^{k \times d}$ have i.i.d. $N(0,1)$ entries.
  • Define $h_i(x) = \mathrm{sign}(A_i x)$

Page 10

Outline of group work

• First (problems 1-5) you will show that:
  • If 𝑥, 𝑦 are close, then probably there’s some 𝑖 so that $h_i(x) = h_i(y)$.
  • If 𝑥, 𝑦 are far, then probably there’s no such 𝑖.

• Then (problems 6,7), you will show how to use this to get an (𝑟, 𝑐)-near-neighbors scheme.

Page 11

Group work!

Page 12

Question 1

• For each row $a_j$ of $A$, we have a hyperplane $\{x \in \mathbb{R}^d : a_j x = 0\}$.
  • If the corresponding coordinate of $Ax$ is negative, then 𝑥 lies on one side of the hyperplane, else 𝑥 lies on the other.
  • Same cell = same side of every hyperplane = same sign in every coordinate.

[Figure: the product $Ax = (0.2, -1.3, 0.7, 3.2, -5, 0.01)^T$ and its sign pattern $h(x) = (+1, -1, +1, +1, -1, +1)^T$, with one row $a_j$ of $A$ highlighted.]

$h_i(x) \ne h_i(y) \Rightarrow$ some hyperplane separates $x$ and $y$.

Page 13

Question 2

$\Pr[h_i(x) = h_i(y)] = (1 - \mathrm{angle}(x, y)/\pi)^k$

$h_i(x) \ne h_i(y) \Rightarrow$ some hyperplane separates $x$ and $y$.

$P\{\text{a given hyperplane separates } x \text{ and } y\} = \dfrac{\text{arc length between } x \text{ and } y}{\pi} = \dfrac{\mathrm{angle}(x, y)}{\pi}$

$P\{\text{none of the } k \text{ hyperplanes do that}\} = (1 - \mathrm{angle}(x, y)/\pi)^k$
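The collision probability $(1 - \mathrm{angle}(x, y)/\pi)^k$ is easy to sanity-check by simulation. The sketch below is an illustration, not part of the slides: `collision_rate` is a hypothetical helper, and the dimension, angle, trial count, and seed are arbitrary choices.

```python
import math
import random

def collision_rate(theta, k, d=3, trials=20000, seed=0):
    """Estimate Pr[sign(Ax) == sign(Ay)] for unit vectors x, y at angle theta."""
    rng = random.Random(seed)
    x = [1.0] + [0.0] * (d - 1)
    y = [math.cos(theta), math.sin(theta)] + [0.0] * (d - 2)
    hits = 0
    for _ in range(trials):
        same = True
        for _ in range(k):
            # A fresh Gaussian row a defines the hyperplane {z : <a, z> = 0}.
            a = [rng.gauss(0.0, 1.0) for _ in range(d)]
            side_x = sum(ai * xi for ai, xi in zip(a, x)) >= 0
            side_y = sum(ai * yi for ai, yi in zip(a, y)) >= 0
            if side_x != side_y:  # this hyperplane separates x and y
                same = False
                break
        hits += same
    return hits / trials

theta, k = math.pi / 6, 4
estimate = collision_rate(theta, k)
exact = (1 - theta / math.pi) ** k
print(estimate, exact)  # the two numbers should be close
```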

Page 14

Question 3

$\Pr[h_i(x) = h_i(y)]$ for any one $i$: $(1 - \mathrm{angle}(x, y)/\pi)^k$

$\Pr[h_i(x) \ne h_i(y)]$ for any one $i$: $1 - (1 - \mathrm{angle}(x, y)/\pi)^k$

$\Pr[h_i(x) \ne h_i(y)]$ for all $s$ values of $i$: $\left(1 - (1 - \mathrm{angle}(x, y)/\pi)^k\right)^s$

Page 15

Question 4

With our choice of $s = \sqrt{n}$, $k = \frac{\pi \log n}{2r}$,

$P\{\forall i,\ h_i(x) \ne h_i(y)\} = \left(1 - (1 - \mathrm{angle}(x, y)/\pi)^k\right)^{\sqrt{n}}$

$\approx \left(1 - \exp\left(-\frac{\log(n) \cdot \mathrm{angle}(x, y)}{2r}\right)\right)^{\sqrt{n}}$

$\approx \exp\left(-\sqrt{n} \cdot n^{-\mathrm{angle}(x, y)/(2r)}\right)$

$= \exp\left(-n^{1/2 - \mathrm{angle}(x, y)/(2r)}\right)$

Page 16

Question 5(a)

If $\mathrm{angle}(x, y) \le r$,

$P\{\forall i,\ h_i(x) \ne h_i(y)\} \approx \exp\left(-n^{1/2 - \mathrm{angle}(x, y)/(2r)}\right) \le \exp(-1)$, so

$P\{\exists i,\ h_i(x) = h_i(y)\} \ge 1 - 1/e$.

Page 17

Question 5(b)

If $\mathrm{angle}(x, y) \ge 5r$,

$\exp\left(-n^{1/2 - \mathrm{angle}(x, y)/(2r)}\right) \ge \exp\left(-n^{1/2 - 5/2}\right) = \exp\left(-n^{-2}\right) \approx 1 - 1/n^2$, so

$P\{\exists i,\ h_i(x) = h_i(y)\} \le 1/n^2$.
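The near and far cases can be checked numerically by plugging $s = \sqrt{n}$ and $k = \frac{\pi \log n}{2r}$ into the exact expression $1 - \left(1 - (1 - \mathrm{angle}/\pi)^k\right)^s$. The sketch below assumes the natural logarithm and rounds $k$ and $s$ up to integers; both choices, the helper names, and the values of $n$ and $r$ are assumptions made for illustration.

```python
import math

def params(n, r):
    """k = pi * ln(n) / (2r) and s = sqrt(n), each rounded up (rounding is an assumption)."""
    k = math.ceil(math.pi * math.log(n) / (2 * r))
    s = math.ceil(math.sqrt(n))
    return k, s

def prob_some_collision(angle, n, r):
    """Exact Pr[exists i with h_i(x) = h_i(y)] for s independent k-bit sign hashes."""
    k, s = params(n, r)
    p_one = (1 - angle / math.pi) ** k   # one hash agrees on all k signs
    return 1 - (1 - p_one) ** s          # at least one of the s hashes agrees

n, r = 10_000, 0.1
near = prob_some_collision(r, n, r)      # angle(x, y) = r
far = prob_some_collision(5 * r, n, r)   # angle(x, y) = 5r
print(near, far, 1 / n**2)
```

For these values the near-case probability comes out at roughly 0.6, slightly under $1 - 1/e \approx 0.632$, because the derivation replaces $(1 - a/\pi)^k$ by the slightly larger $\exp(-ka/\pi)$. The far-case probability stays below $1/n^2$.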

Page 18

Question 6

• Query Algorithm:
  • For $i = 1, 2, \ldots, s$:
    • Compute $h_i(y)$
    • If there’s some $x_j$ so that $h_i(x_j) = h_i(y)$, return it.

• If $\mathrm{angle}(y, x_\ell) \le r$, then with decent probability there’s some $i$ so that $h_i(x_\ell) = h_i(y)$.
  • In particular, the algorithm will return something.

• If the algorithm returns $x_j$ then with high probability $\mathrm{angle}(y, x_j) \le 5r$.
  • If $\mathrm{angle}(y, x_j) > 5r$, $\Pr[\exists i,\ h_i(x_j) = h_i(y)] \le \frac{1}{n^2}$, and we can union bound over all $x_j$ to say that never happens whp.
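Putting the pieces together, the data structure and query algorithm might look like the following sketch. It is one possible reading of the slides, not the course's implementation: the class name `SignLSH`, the use of Python dictionaries as hash tables, the natural logarithm, and the rounding of $k$ and $s$ are all assumptions.

```python
import math
import random
from collections import defaultdict

class SignLSH:
    """s independent sign hashes h_i(x) = sign(A_i x), one hash table per hash."""

    def __init__(self, points, r, seed=0):
        rng = random.Random(seed)
        n, d = len(points), len(points[0])
        self.points = points
        self.k = max(1, math.ceil(math.pi * math.log(n) / (2 * r)))
        self.s = max(1, math.ceil(math.sqrt(n)))
        # s matrices A_i of shape k x d with i.i.d. N(0,1) entries.
        self.mats = [[[rng.gauss(0.0, 1.0) for _ in range(d)]
                      for _ in range(self.k)] for _ in range(self.s)]
        # Table i maps a sign pattern to the indices j with h_i(x_j) equal to it.
        self.tables = []
        for A in self.mats:
            table = defaultdict(list)
            for j, x in enumerate(points):
                table[self._hash(A, x)].append(j)
            self.tables.append(table)

    @staticmethod
    def _hash(A, x):
        # True/False stands in for +1/-1 in each coordinate of sign(Ax).
        return tuple(sum(a * xi for a, xi in zip(row, x)) >= 0 for row in A)

    def query(self, y):
        """Return the index of some x_j sharing a bucket with y, or None."""
        for A, table in zip(self.mats, self.tables):
            bucket = table.get(self._hash(A, y))
            if bucket:
                return bucket[0]
        return None

# Hypothetical data: 16 evenly spaced points on the unit circle.
points = [(math.cos(2 * math.pi * t / 16), math.sin(2 * math.pi * t / 16))
          for t in range(16)]
lsh = SignLSH(points, r=0.2)
print(lsh.s, lsh.k)           # 4 22
print(lsh.query(points[3]))   # some index whose point collides with points[3]
```

Querying with a stored point always returns something, since that point hashes to its own bucket in the first table. A query far from every stored point will usually return `None`.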

Page 19

Question 7

Using $\frac{2}{\pi} \mathrm{angle}(x, y) \le \|x - y\|_2 \le \mathrm{angle}(x, y)$, we can conclude:

• If $\|x - y\|_2 \le \frac{2}{\pi} \cdot r$, $P\{\exists i,\ h_i(x) = h_i(y)\} \ge 1 - 1/e$.

• If $\|x - y\|_2 \ge 5r$, $P\{\exists i,\ h_i(x) = h_i(y)\} \le 1/n^2$.

So just fiddle with the value of “c” and the same analysis will still work.
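On the unit sphere the distance between two points at angle $\theta$ is the chord length $2\sin(\theta/2)$, so the sandwich $\frac{2}{\pi}\theta \le \|x - y\|_2 \le \theta$ can be checked numerically. The helper name `chord_bounds_hold` and the small floating-point tolerance are assumptions of this sketch.

```python
import math

def chord_bounds_hold(steps=1000, tol=1e-12):
    """Check (2/pi) * theta <= 2 * sin(theta / 2) <= theta on a grid of angles in (0, pi]."""
    for t in range(1, steps + 1):
        theta = math.pi * t / steps
        chord = 2 * math.sin(theta / 2)   # ||x - y||_2 for unit vectors at angle theta
        if not ((2 / math.pi) * theta <= chord + tol and chord <= theta + tol):
            return False
    return True

print(chord_bounds_hold())  # True
```

The lower bound is tight at $\theta = \pi$ (antipodal points, distance 2), and the upper bound is tight as $\theta \to 0$.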

Page 20

Wrapping up: Time and Space

$k = O(\log n)$, $s = \sqrt{n}$

• Space: $n^{O(1)}$
  • $s$ different $k \times d$ matrices $A_i$: $O(d \cdot \sqrt{n} \cdot \log n)$
  • $s$ hash tables, each with $2^k$ buckets: $O(\sqrt{n} \cdot 2^{O(\log n)}) = n^{O(1)}$
  • The elements of $S$ themselves: $O(nd)$

• Update time: $O(d \cdot \sqrt{n} \cdot \log n) = o(n)$ when $d$ isn’t too big.
  • $s$ different $k \times d$ matrix-vector multiplies: $O(s \cdot k \cdot d) = O(d \sqrt{n} \cdot \log n)$
  • Going through all $s$ hash tables and looking in $h_i(y)$’s bucket to see if there’s anything else: $O(s) = O(\sqrt{n})$
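To get a feel for these bounds, the sketch below plugs in concrete numbers and compares the work per query against a brute-force scan. The function name `lsh_costs`, the chosen values of $n$, $d$, and $r$, the natural logarithm, and the rounding are all assumptions for illustration; the counts ignore constant factors beyond the $\frac{\pi}{2r}$ inside $k$.

```python
import math

def lsh_costs(n, d, r):
    """Rough operation counts for the scheme with s = sqrt(n), k = pi * ln(n) / (2r)."""
    k = math.ceil(math.pi * math.log(n) / (2 * r))
    s = math.ceil(math.sqrt(n))
    return {
        "k": k,
        "s": s,
        "matrix_entries": s * k * d,      # storage for the s matrices A_i
        "query_multiplies": s * k * d,    # s matrix-vector products per query
        "bucket_lookups": s,              # one table lookup per hash function
        "brute_force_multiplies": n * d,  # linear scan over all of S
    }

costs = lsh_costs(n=1_000_000, d=100, r=0.5)
print(costs)
```

With these numbers the query cost is a few million multiplications versus $10^8$ for the scan, but the gap narrows quickly as $r$ shrinks, because $k$ grows like $1/r$.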