![Page 1: Beyond Set Disjointness : The Communication Complexity of Finding the Intersection](https://reader035.vdocument.in/reader035/viewer/2022062520/56816024550346895dcf297a/html5/thumbnails/1.jpg)
Beyond Set Disjointness: The Communication Complexity of Finding the
Intersection
Grigory Yaroslavtsev, http://grigory.us
Joint with Brody, Chakrabarti, Kondapally and Woodruff
Communication Complexity [Yao'79]
Alice: $x$    Bob: $y$
$f(x, y) = ?$
Shared randomness
… $f(x, y)$
• $R(f)$ = min. communication (constant error, e.g. $1/3$)
• $R^{(r)}(f)$ = min. $r$-round communication (constant error, e.g. $1/3$)
Set Intersection
$x = S$, $y = T$, $f(x, y) = S \cap T$
$S \subseteq [n]$, $|S| \le k$    $T \subseteq [n]$, $|T| \le k$
$S \cap T = ?$
$R(k\text{-Intersection}) = ?$
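This talk
Let $\mathrm{ilog}^r k = \underbrace{\log \log \dots \log}_{r \text{ times}} k$
• $R^{(O(r))}(k\text{-Intersection}) = O(k\,\mathrm{ilog}^r k)$ [Brody, Chakrabarti, Kondapally, Woodruff, Y.; PODC'14]
• $R^{(r)}(k\text{-Intersection}) = \Omega(k\,\mathrm{ilog}^r k)$ [Saglam-Tardos FOCS'13; Brody, Chakrabarti, Kondapally, Woodruff, Y.'13]
$R^{(r)}(k\text{-Intersection}) = \Theta(k)$ for $r = O(\log^* k)$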
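To make the iterated logarithm in these bounds concrete, here is a minimal Python sketch. The helper names `ilog` and `log_star`, the choice of base 2, and the clamping at 1 are assumptions made for illustration and are not taken from the talk.

```python
import math

def ilog(r: int, k: float) -> float:
    """Apply log2 to k a total of r times, clamping the result at 1."""
    value = float(k)
    for _ in range(r):
        if value <= 1.0:
            # Further logarithms would be non-positive; stop at 1.
            return 1.0
        value = math.log2(value)
    return max(value, 1.0)

def log_star(k: float) -> int:
    """Count how many times log2 must be applied before the value is <= 1."""
    count = 0
    value = float(k)
    while value > 1.0:
        value = math.log2(value)
        count += 1
    return count
```

With these definitions, `ilog(r, k)` reaches 1 once `r` is at least `log_star(k)`, which is the regime where a bound of the form k * ilog(r, k) becomes linear in k.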
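$k$-Disjointness
• $f(S, T) = 1$ iff $S \cap T = \emptyset$
• $R(k\text{-Disjointness}) = \Theta(k)$ [Razborov'92; Hastad-Wigderson'96]
• $R^{(1)}(k\text{-Disjointness}) = \Theta(k \log k)$ [Folklore + Dasgupta, Kumar, Sivakumar; Buhrman, Garcia-Soriano, Matsliah, De Wolf'12]
• $R^{(r)}(k\text{-Disjointness}) = \Theta(k\,\mathrm{ilog}^r k)$ [Saglam, Tardos'13]
• $R(k\text{-Disjointness}) = \frac{2}{\ln 2} k \pm o(k)$ [Braverman, Garg, Pankratov, Weinstein'13]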
Applications
• $\frac{|S \cap T|}{|S \cup T|}$: exact Jaccard index (for $\epsilon$-approximate use MinHash [Broder'98; Li-Konig'11; Pagh-Stockel-Woodruff'14])
• Rarity, distinct elements, joins, …
• Multi-party set intersection (later)
• Contrast:
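For the Jaccard application listed above, a minimal sketch of how the exact index follows once the intersection is known. It assumes the two players also exchange their set sizes; the function name `jaccard_from_intersection` is hypothetical.

```python
def jaccard_from_intersection(size_s: int, size_t: int, intersection: set) -> float:
    """Exact Jaccard index |S ∩ T| / |S ∪ T| from the set sizes and the intersection."""
    inter = len(intersection)
    union = size_s + size_t - inter   # inclusion-exclusion
    # Two empty sets are treated as identical (an assumption for this sketch).
    return inter / union if union else 1.0
```

For example, with S = {1, 2, 3} and T = {2, 3, 4} the intersection has 2 elements and the union has 4, so the index is 0.5.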
1-round $O(k \log k)$-protocol
$h : [n] \to [k^3]$
[Figure: $S, T \subseteq [n]$ are mapped by $h$ to $h(S), h(T) \subseteq [k^3]$]
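The one-round idea can be simulated as follows. This is a sketch under stated assumptions, not the talk's exact protocol: a seeded `random.Random` stands in for shared randomness, the hash is a lazily sampled random function into a range of size k^3, and only Bob learns the output. The name `one_round_intersection` is hypothetical.

```python
import math
import random

def one_round_intersection(S, T, k, seed=0):
    """Alice sends h(S); Bob keeps the elements of T whose hash appears in it."""
    rng = random.Random(seed)            # stands in for shared randomness
    hash_range = max(k, 2) ** 3          # h maps into a range of size k^3
    table = {}

    def h(x):
        # Lazily sampled random function, identical for both players.
        if x not in table:
            table[x] = rng.randrange(hash_range)
        return table[x]

    message = {h(x) for x in S}          # Alice -> Bob
    bits = len(message) * math.ceil(math.log2(hash_range))
    candidates = {y for y in T if h(y) in message}   # Bob's output
    return candidates, bits
```

The output always contains the true intersection. It contains extra elements only if some element of T outside S collides with an element of S, which a union bound over at most k^2 pairs limits to probability about 1/k.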
Hashing
$h : [n] \to [k / \log k]$
$k / \log k$ = # of buckets
Expected # of elements per bucket: $\log k$
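A small simulation of this bucketing step, assuming a truly random hash function and base-2 logarithms. The helper `bucket_loads` is a hypothetical name introduced for illustration.

```python
import math
import random
from collections import Counter

def bucket_loads(elements, k, seed=0):
    """Hash each element into one of about k / log2(k) buckets and count the loads."""
    rng = random.Random(seed)                        # stands in for shared randomness
    num_buckets = max(1, round(k / math.log2(k)))    # requires k >= 2
    loads = Counter(rng.randrange(num_buckets) for _ in elements)
    return num_buckets, loads
```

With k = 1024 elements this gives 102 buckets and an average load of 1024 / 102, close to log2(1024) = 10.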
Secondary Hashing
$h_i : [n] \to [\log^3 k]$, where $i \in [k / \log k]$
$k / \log k$ = # of hash functions
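2-Round $O(k \log\log k)$-protocol
[Figure: on each side, every bucket is hashed into a range of size $\log^3 k$]
$|h_i(S_i)| = |h_i(T_i)| = O(\log k \log\log k)$ bits
Total communication = $\frac{k}{\log k} \cdot O(\log k \log\log k)$ = $O(k \log\log k)$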
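As a sanity check on this total, a short accounting. It assumes that each secondary hash value is sent as a string of $\log_2(\log^3 k)$ bits and that marking bucket boundaries costs only lower-order terms.

```latex
\[
\underbrace{\bigl(|S| + |T|\bigr)}_{\le\, 2k \text{ elements}}
\cdot
\underbrace{\log_2\!\bigl(\log^3 k\bigr)}_{=\, 3 \log_2 \log k \text{ bits each}}
\;\le\; 6k \log_2 \log k
\;=\; O(k \log\log k).
\]
```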
Collisions
[Figure: $k / \log k$ buckets, each hashed into a range of size $\log^3 k$]
$\Pr[\text{collision}] = O(1 / \log k)$
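The collision bound follows from a union bound over pairs inside one bucket. The derivation below assumes that bucket $i$ holds $a = |S_i \cup T_i| = O(\log k)$ distinct elements and that any two of them collide under $h_i$ with probability $1 / \log^3 k$.

```latex
\[
\Pr[\text{collision in bucket } i]
\;\le\; \binom{a}{2} \cdot \frac{1}{\log^3 k}
\;\le\; \frac{a^2}{2 \log^3 k}
\;=\; O\!\left(\frac{1}{\log k}\right)
\quad \text{when } a = O(\log k).
\]
```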
Collisions
• Second round:
– For each bucket send an $O(\log k)$-bit equality check (total $O(k)$ communication)
– Correct intersection computed in buckets where the check passes
– Expected # of items in incorrect buckets: $O(k / \log k)$
– Use 1-round protocol for incorrect buckets
– Total communication: $O(k \log\log k)$
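The two rounds can be put together in a simulation like the one below. It is a sketch under explicit assumptions: a single lazily sampled function `g` plays the role of all per-bucket secondary hashes (each element lives in exactly one bucket), the equality check is performed exactly rather than with short fingerprints, and the fallback for failed buckets computes the bucket intersection directly as a stand-in for the 1-round protocol. All names are hypothetical.

```python
import math
import random

def two_round_intersection(S, T, k, seed=0):
    rng = random.Random(seed)                       # stands in for shared randomness
    log_k = max(1, math.ceil(math.log2(max(k, 2))))
    num_buckets = max(1, k // log_k)                # about k / log k buckets
    hash_range = log_k ** 3                         # secondary range log^3 k
    bucket_of, fingerprint = {}, {}

    def h(x):                                       # primary hash: element -> bucket
        if x not in bucket_of:
            bucket_of[x] = rng.randrange(num_buckets)
        return bucket_of[x]

    def g(x):                                       # secondary hash inside the bucket
        if x not in fingerprint:
            fingerprint[x] = rng.randrange(hash_range)
        return fingerprint[x]

    def split(items):
        buckets = [set() for _ in range(num_buckets)]
        for x in items:
            buckets[h(x)].add(x)
        return buckets

    result, failed = set(), 0
    for S_i, T_i in zip(split(S), split(T)):
        # Round 1: both players exchange the hashed contents of the bucket.
        g_s = {g(x) for x in S_i}
        g_t = {g(y) for y in T_i}
        alice = {x for x in S_i if g(x) in g_t}     # Alice's candidates
        bob = {y for y in T_i if g(y) in g_s}       # Bob's candidates
        # Round 2: equality check on the candidates (exact in this sketch).
        if alice == bob:
            result |= alice
        else:
            failed += 1
            result |= S_i & T_i                     # stand-in for the 1-round fallback
    return result, failed
```

The check is sound because each player's candidate set always contains the bucket's true intersection, so equal candidate sets must both equal that intersection.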
Main protocol
$h : [n] \to [k]$
$k$ = # of buckets
Expected # of elements per bucket: $O(1)$
Verification tree
[Figure: verification tree; labels: node degree, …, $\mathrm{ilog}^{r-1} k$ (iterated logarithm)]
$k$ buckets = leaves of the verification tree
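Verification bottom-up
[Figure: two child nodes hold $(S_i^*, T_i^*)$ and $(S_j^*, T_j^*)$ with outputs $S_i^* \cap T_i^*$ and $S_j^* \cap T_j^*$; their parent holds $(S_i^* \cup S_j^*,\ T_i^* \cup T_j^*)$ with output $(S_i^* \cup S_j^*) \cap (T_i^* \cup T_j^*)$]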
Verification bottom-up
[Figure: an EQ test at the parent node $(S_i^* \cup S_j^*) \cap (T_i^* \cup T_j^*)$ above the children $S_i^* \cap T_i^*$ and $S_j^* \cap T_j^*$; node labels: Correct, Incorrect]
Verification bottom-up
[Figure: the same trees with EQ applied at the parent node $(S_i^* \cup S_j^*) \cap (T_i^* \cup T_j^*)$; node labels: Correct, Incorrect]
Verification bottom-up
[Figure: a node of the verification tree above a row of leaves, each leaf holding a pair of sets $(S, T)$]
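The bottom-up verification idea can be illustrated with the simulation below. It is a simplified sketch and not the talk's exact construction: the tree is binary rather than of the degrees shown on the slides, an equality test with b bits is modelled as a one-sided check that wrongly accepts unequal inputs with probability 2^-b, the number of bits grows with the height of the node, the root check is exact, and a failed check re-runs the whole subtree. All names and parameters are hypothetical.

```python
import random

def verify_bottom_up(S, T, num_leaves=16, leaf_range=64, seed=0):
    rng = random.Random(seed)                       # stands in for shared randomness
    bucket = {x: rng.randrange(num_leaves) for x in sorted(S | T)}
    S_b = [{x for x in S if bucket[x] == i} for i in range(num_leaves)]
    T_b = [{y for y in T if bucket[y] == i} for i in range(num_leaves)]
    alice = [set() for _ in range(num_leaves)]      # Alice's candidates per leaf
    bob = [set() for _ in range(num_leaves)]        # Bob's candidates per leaf
    runs = [0] * num_leaves                         # how often each leaf was (re)run

    def run_leaf(i):
        # Basic intersection on one bucket with fresh small-range hashes.
        g = {x: rng.randrange(leaf_range) for x in sorted(S_b[i] | T_b[i])}
        g_s = {g[x] for x in S_b[i]}
        g_t = {g[y] for y in T_b[i]}
        alice[i] = {x for x in S_b[i] if g[x] in g_t}
        bob[i] = {y for y in T_b[i] if g[y] in g_s}
        runs[i] += 1

    def eq(a, b, bits):
        # Equal inputs always pass; unequal inputs pass with probability
        # 2^-bits, or never when bits is None (exact check).
        if a == b:
            return True
        return bits is not None and rng.random() < 2.0 ** -bits

    def process(lo, hi):
        # Handle the subtree over leaves lo .. hi-1, repeating until its check passes.
        while True:
            if hi - lo == 1:
                run_leaf(lo)
            else:
                mid = (lo + hi) // 2
                process(lo, mid)
                process(mid, hi)
            a = set().union(*alice[lo:hi])
            b = set().union(*bob[lo:hi])
            is_root = (lo, hi) == (0, num_leaves)
            bits = None if is_root else 2 * (hi - lo).bit_length()
            if eq(a, b, bits):
                return a

    return process(0, num_leaves), runs
```

Cheap checks near the leaves catch most errors locally, so the more expensive checks higher up rarely fail. Because the root check is exact in this sketch, the returned set always equals the true intersection.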
Analysis of Stage $i$
• … = Pr[node at stage $i$ computed correctly]
• Set … = …
– Run equality checks and basic intersection protocols with success probability …
– Key lemma: E[# of restarts per leaf] = …
– Cost of Equality = …
– Cost of Intersection in leaves = …
• Pr[protocol succeeds] = …
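Lower Bound
• $R^{(r)}(k\text{-Intersection}) = \Omega(k\,\mathrm{ilog}^r k)$ [Brody, Chakrabarti, Kondapally, Woodruff, Y.'13]
• $EQ(x, y) = 1$ iff $x = y$, where $x, y \in \{0, 1\}^n$
• $EQ^k$ = solving $k$ independent instances of $EQ$
• $EQ^k$ reduces to $k$-Intersection:
– Given $(x_1, \dots, x_k)$ and $(y_1, \dots, y_k)$
– Construct sets with elements $(i, x_i)$ and $(i, y_i)$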
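The reduction in the last bullet can be written out directly. This sketch tags each string with its instance index, so that two tagged elements coincide only when they come from the same instance and the strings are equal. The function names are hypothetical.

```python
def eq_instances_to_sets(xs, ys):
    """Map k equality instances to two sets of at most k tagged elements each."""
    S = {(i, x) for i, x in enumerate(xs)}
    T = {(i, y) for i, y in enumerate(ys)}
    return S, T

def answers_from_intersection(intersection, k):
    """EQ(x_i, y_i) = 1 exactly when a pair tagged i survives in the intersection."""
    hit = {i for i, _ in intersection}
    return [1 if i in hit else 0 for i in range(k)]
```

Any protocol that outputs the intersection of the two constructed sets therefore answers all k equality instances, so a lower bound for the k equality instances carries over to the intersection problem.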
Communication Direct Sums
“Solving m copies of a communication problem requires m times more communication”:
• For arbitrary $f$ [… Braverman, Rao 10; Barak, Braverman, Chen, Rao 11, …]
• In general, can't go beyond …
Specialized Communication Direct Sums
Information cost $\le$ Communication complexity
• [Bar-Yossef, Jayram, Kumar, Sivakumar'01]: Disjointness
• Stronger direct sum for bounded-round complexity of Equality-type problems (a.k.a. “union bound is optimal”) [Molinaro, Woodruff, Y.'13]
Extensions
• Multi-party: $m$ players, $S_1, \dots, S_m \subseteq [n]$, where $|S_i| \le k$
– Boost error probability to …
– Average per player (using coordinator): … in … rounds
– Worst-case per player (using a tournament): … in … rounds
Open Problems
• $R^{(r)}(k\text{-Intersection})$ = …
• Better protocols for the multi-party setting