a matrix expander chernoff boundnikhil/mchernoff.pdfĀ Ā· ankit garg, yin tat lee, zhao song, nikhil...
TRANSCRIPT
-
A Matrix Expander Chernoff Bound
Ankit Garg, Yin Tat Lee, Zhao Song, Nikhil Srivastava
UC Berkeley
-
Vanilla Chernoff Bound
Thm [Hoeffding, Chernoff]. If š1, ā¦ , šš are independent mean zerorandom variables with šš ā¤ 1 then
ā1
š
š
šš ā„ š ā¤ 2exp(āšš2/4)
-
Vanilla Chernoff Bound
Thm [Hoeffding, Chernoff]. If š1, ā¦ , šš are independent mean zerorandom variables with šš ā¤ 1 then
ā1
š
š
šš ā„ š ā¤ 2exp(āšš2/4)
Two Extensions:1.Dependent Random Variables2.Sums of random matrices
-
Expander Chernoff Bound [AKSā87, Gā94]
Thm[Gillmanā94]: Suppose šŗ = (š, šø) is a regular graph with transition matrix š which has second eigenvalue š. Let š: š ā ābe a function with š š£ ā¤ 1 and Ļš£ š š£ = 0.
-
Expander Chernoff Bound [AKSā87, Gā94]
Thm[Gillmanā94]: Suppose šŗ = (š, šø) is a regular graph with transition matrix š which has second eigenvalue š. Let š: š ā ābe a function with š š£ ā¤ 1 and Ļš£ š š£ = 0.Then, if š£1, ā¦ , š£š is a stationary random walk:
ā1
š
š
š(š£š) ā„ š ā¤ 2exp(āš(1 ā š)šš2)
Implies walk of length š ā 1 ā š ā1 concentrates around mean.
-
Intuition for Dependence on Spectral Gap
š = +1 š = ā1
1 ā š =1
š2
-
Intuition for Dependence on Spectral Gap
1 ā š =1
š2
Typical random walk takes Ī© š2 steps to see both Ā±1.
-
Derandomization Motivation
Thm[Gillmanā94]: Suppose šŗ = (š, šø) is a regular graph with transition matrix š which has second eigenvalue š. Let š: š ā ābe a function with š š£ ā¤ 1 and Ļš£ š š£ = 0.Then, if š£1, ā¦ , š£š is a stationary random walk:
ā1
š
š
š(š£š) ā„ š ā¤ 2exp(āš(1 ā š)šš2)
To generate š indep samples from [š] need š log š bits.If šŗ is 3 āregular with constant š, walk needs: log š + š log(3) bits.
-
Derandomization Motivation
Thm[Gillmanā94]: Suppose šŗ = (š, šø) is a regular graph with transition matrix š which has second eigenvalue š. Let š: š ā ābe a function with š š£ ā¤ 1 and Ļš£ š š£ = 0.Then, if š£1, ā¦ , š£š is a stationary random walk:
ā1
š
š
š(š£š) ā„ š ā¤ 2exp(āš(1 ā š)šš2)
To generate š indep samples from [š] need š log š bits.If šŗ is 3 āregular with constant š, walk needs: log š + š log(3) bits.
When š = š(log š) reduces randomness quadratically. Can completely derandomize in polynomial time.
-
Matrix Chernoff Bound
Thm [Rudelsonā97, Ahlswede-Winterā02, Oliveiraā08, Troppā11ā¦]. If š1, ā¦ , šš are independent mean zero random š Ć š Hermitianmatrices with | šš | ā¤ 1 then
ā1
š
š
šš ā„ š ā¤ 2šexp(āšš2/4)
-
Matrix Chernoff Bound
Thm [Rudelsonā97, Ahlswede-Winterā02, Oliveiraā08, Troppā11ā¦]. If š1, ā¦ , šš are independent mean zero random š Ć š Hermitianmatrices with | šš | ā¤ 1 then
ā1
š
š
šš ā„ š ā¤ 2šexp(āšš2/4)
Very generic bound (no independence assumptions on the entries).Many applications + martingale extensions (see Tropp).
Factor š is tight because of the diagonal case.
-
Matrix Expander Chernoff Bound?
Conj[Wigderson-Xiaoā05]: Suppose šŗ = (š, šø) is a regular graph with transition matrix š which has second eigenvalue š. Let š: š ā āšĆš be a function with | š š£ | ā¤ 1 and Ļš£ š š£ = 0.Then, if š£1, ā¦ , š£š is a stationary random walk:
ā1
š
š
š(š£š) ā„ š ā¤ 2šexp(āš(1 ā š)šš2)
Motivated by derandomized Alon-Roichman theorem.
-
Main Theorem
Thm. Suppose šŗ = (š, šø) is a regular graph with transition matrix š which has second eigenvalue š. Let š: š ā āšĆš be a function with | š š£ | ā¤ 1 and Ļš£ š š£ = 0.Then, if š£1, ā¦ , š£š is a stationary random walk:
ā1
š
š
š(š£š) ā„ š ā¤ 2šexp(āš(1 ā š)šš2)
Gives black-box derandomization of any application of matrix Chernoff
-
1. Proof of Chernoff: reduction to mgf
Thm [Hoeffding, Chernoff]. If š1, ā¦ , šš are independent mean zerorandom variables with šš ā¤ 1 then
ā1
š
š
šš ā„ š ā¤ 2exp(āšš2/4)
ā
š
šš ā„ šš ā¤ šāš”ššš¼ exp š”
š
šš = šāš”ššą·
š
š¼šš”šš
ā¤ šāš”šš 1 + š”š¼šš + š”2 š ā¤ šāš”šš+šš”
2ā¤ eā
šš2
4
-
1. Proof of Chernoff: reduction to mgf
Thm [Hoeffding, Chernoff]. If š1, ā¦ , šš are independent mean zerorandom variables with šš ā¤ 1 then
ā1
š
š
šš ā„ š ā¤ 2exp(āšš2/4)
ā
š
šš ā„ šš ā¤ šāš”ššš¼ exp š”
š
šš = šāš”ššą·
š
š¼šš”šš
ā¤ šāš”šš 1 + š”š¼šš + š”2 š ā¤ šāš”šš+šš”
2ā¤ eā
šš2
4
Markov
-
1. Proof of Chernoff: reduction to mgf
Thm [Hoeffding, Chernoff]. If š1, ā¦ , šš are independent mean zerorandom variables with šš ā¤ 1 then
ā1
š
š
šš ā„ š ā¤ 2exp(āšš2/4)
ā
š
šš ā„ šš ā¤ šāš”ššš¼ exp š”
š
šš = šāš”ššą·
š
š¼šš”šš
ā¤ šāš”šš 1 + š”š¼šš + š”2 š ā¤ šāš”šš+šš”
2ā¤ eā
šš2
4
Indep.
-
1. Proof of Chernoff: reduction to mgf
Thm [Hoeffding, Chernoff]. If š1, ā¦ , šš are independent mean zerorandom variables with šš ā¤ 1 then
ā1
š
š
šš ā„ š ā¤ 2exp(āšš2/4)
ā
š
šš ā„ šš ā¤ šāš”ššš¼ exp š”
š
šš = šāš”ššą·
š
š¼šš”šš
ā¤ šāš”šš 1 + š”š¼šš + š”2 š ā¤ šāš”šš+šš”
2ā¤ eā
šš2
4
Bounded
-
1. Proof of Chernoff: reduction to mgf
Thm [Hoeffding, Chernoff]. If š1, ā¦ , šš are independent mean zerorandom variables with šš ā¤ 1 then
ā1
š
š
šš ā„ š ā¤ 2exp(āšš2/4)
ā
š
šš ā„ šš ā¤ šāš”ššš¼ exp š”
š
šš = šāš”ššą·
š
š¼šš”šš
ā¤ šāš”šš 1 + š”š¼šš + š”2 š ā¤ šāš”šš+šš”
2ā¤ eā
šš2
4
Mean zero
-
1. Proof of Chernoff: reduction to mgf
Thm [Hoeffding, Chernoff]. If š1, ā¦ , šš are independent mean zerorandom variables with šš ā¤ 1 then
ā1
š
š
šš ā„ š ā¤ 2exp(āšš2/4)
ā
š
šš ā„ šš ā¤ šāš”ššš¼ exp š”
š
šš = šāš”ššą·
š
š¼šš”šš
ā¤ šāš”šš 1 + š”š¼šš + š”2 š ā¤ šāš”šš+šš”
2ā¤ eā
šš2
4
-
1. Proof of Chernoff: reduction to mgf
Thm [Hoeffding, Chernoff]. If š1, ā¦ , šš are independent mean zerorandom variables with šš ā¤ 1 then
ā1
š
š
šš ā„ š ā¤ 2exp(āšš2/4)
ā
š
šš ā„ šš ā¤ šāš”ššš¼ exp š”
š
šš = šāš”ššą·
š
š¼šš”šš
ā¤ šāš”šš 1 + š”š¼šš + š”2 š ā¤ šāš”šš+šš”
2ā¤ eā
šš2
4
-
2. Proof of Expander Chernoff
Goal: Show š¼exp š” Ļš š š£š ā¤ exp šššš”2
-
2. Proof of Expander Chernoff
Goal: Show š¼exp š” Ļš š š£š ā¤ exp šššš”2
Issue: š¼exp Ļš š š£š ā Ļš š¼exp(š”š(š£š))
How to control the mgf without independence?
-
Step 1: Write mgf as quadratic form
=
š0,ā¦,ššāš
ā(š£0 = š0)š š0, š1 ā¦š(ššā1, šš) exp š”
1ā¤šā¤š
š šš
=1
š
š0,ā¦,ššāš
š š0, š1 šš”š(š1)ā¦š ššā1, šš š
š”š(šš)
š¼šš” Ļšā¤š š(š£š)
= āØš¢, šøš šš¢ā© where š¢ =1
š, ā¦ ,
1
š. šš
šš
šš
šš
š š0, š1
š š1, š2
š š2, š3
šš”š(š1)
šš”š(š2)
šš”š(š3)
šø =šš”š(1)
šš”š(š)
-
Step 1: Write mgf as quadratic form
=
š0,ā¦,ššāš
ā(š£0 = š0)š š0, š1 ā¦š(ššā1, šš) exp š”
1ā¤šā¤š
š šš
=1
š
š0,ā¦,ššāš
š š0, š1 šš”š(š1)ā¦š ššā1, šš š
š”š(šš)
š¼šš” Ļšā¤š š(š£š)
= āØš¢, šøš šš¢ā© where š¢ =1
š, ā¦ ,
1
š. šš
šš
šš
šš
š š0, š1
š š1, š2
š š2, š3
šš”š(š1)
šš”š(š2)
šš”š(š3)
šø =šš”š(1)
šš”š(š)
-
Step 1: Write mgf as quadratic form
=
š0,ā¦,ššāš
ā(š£0 = š0)š š0, š1 ā¦š(ššā1, šš) exp š”
1ā¤šā¤š
š šš
=1
š
š0,ā¦,ššāš
š š0, š1 šš”š(š1)ā¦š ššā1, šš š
š”š(šš)
š¼šš” Ļšā¤š š(š£š)
= āØš¢, šøš šš¢ā© where š¢ =1
š, ā¦ ,
1
š. šš
šš
šš
šš
š š0, š1
š š1, š2
š š2, š3
šš”š(š1)
šš”š(š2)
šš”š(š3)
šø =šš”š(1)
ā¦šš”š(š)
-
Step 2: Bound quadratic form
Observe: ||š ā š½|| ā¤ š where š½ = complete graph with self loops
So for small š, should have š¢, šøš šš¢ ā āØš¢, (šøš½)šš¢ā©
Goal: Show āØš¢, šøš šš¢ā© ā¤ exp šššš”2
-
Step 2: Bound quadratic form
Observe: ||š ā š½|| ā¤ š where š½ = complete graph with self loops
So for small š, should have š¢, šøš šš¢ ā āØš¢, (šøš½)šš¢ā©
Approach 1. Use perturbation theory to show ||šøš|| ā¤ exp ššš”2
Goal: Show āØš¢, šøš šš¢ā© ā¤ exp šššš”2
-
Step 2: Bound quadratic form
Observe: ||š ā š½|| ā¤ š where š½ = complete graph with self loops
So for small š, should have š¢, šøš šš¢ ā āØš¢, (šøš½)šš¢ā©
Approach 1. Use perturbation theory to show ||šøš|| ā¤ exp ššš”2
Approach 2. (Healyā08) track projection of iterates along š¢.
Goal: Show āØš¢, šøš šš¢ā© ā¤ exp šššš”2
-
Simplest case: š = 0
šøššøššøššøšš¢
-
Simplest case: š = 0
šøššøššøššøšš¢
-
Simplest case: š = 0
šøššøššøššøšš¢
-
Simplest case: š = 0
šøššøššøššøšš¢
-
Simplest case: š = 0
šøššøššøššøšš¢
-
Observations
šøššøššøššøšš¢
Observe: š shrinks every vector orthogonal to š¢ by š.
š¢, šøš¢ =1
nĻš£āV š
š”š(š£) = 1 + š”2 by mean zero
condition.
-
A small dynamical system
š£ = šøš šš¢
š|| = āØš£, š¢ā©
šā„ = ||šā„š£||
š|| šā„
-
A small dynamical system
š£ = šøš šš¢
š|| = āØš£, š¢ā©
šā„ = ||šā„š£||
š|| šā„
0
0
šš”2
š”
š£ ā· šøšš£š = 0
-
A small dynamical system
š£ = šøš šš¢
š|| = āØš£, š¢ā©
šā„ = ||šā„š£||
š|| šā„
šš”š
šš”
šš”2
š”
š£ ā· šøšš£any š
-
A small dynamical system
š£ = šøš šš¢
š|| = āØš£, š¢ā©
šā„ = ||šā„š£||
š|| šā„
ā¤ 1
šš”
šš”2
š”
š£ ā· šøšš£š” ā¤ log 1/š
-
A small dynamical system
š£ = šøš šš¢
š|| = āØš£, š¢ā©
šā„ = ||šā„š£||
š|| šā„
ā¤ 1
šš”
šš”2
š”
Any mass that leaves gets shrunk by š
-
Analyzing dynamical system gives
Thm[Gillmanā94]: Suppose šŗ = (š, šø) is a regular graph with transition matrix š which has second eigenvalue š. Let š: š ā ābe a function with š š£ ā¤ 1 and Ļš£ š š£ = 0.Then, if š£1, ā¦ , š£š is a stationary random walk:
ā1
š
š
š(š£š) ā„ š ā¤ 2exp(āš(1 ā š)šš2)
-
Main Issue: exp š“ + šµ ā exp š“ exp(šµ) unless š“, šµ = 0
Generalization to Matrices?
Setup: š: š ā āšĆš , random walk š£1, ā¦ , š£š.Goal:
š¼šš exp š”
š
š š£š ā¤ š ā exp ššš”2
where šš“ is defined as a power series.
canāt express exp(sum) as iterated product.
-
Main Issue: exp š“ + šµ ā exp š“ exp(šµ) unless š“, šµ = 0
Generalization to Matrices?
Setup: š: š ā āšĆš , random walk š£1, ā¦ , š£š.Goal:
š¼šš exp š”
š
š š£š ā¤ š ā exp ššš”2
where šš“ is defined as a power series.
canāt express exp(sum) as iterated product.
-
The Golden-Thompson InequalityPartial Workaround [Golden-Thompsonā65]:
šš exp š“ + šµ ā¤ šš exp š“ exp šµ
Sufficient for independent case by induction.For expander case, need this for š matrices. False!
šš šš“+šµ+š¶ > 0 > šš(eAeBeC)
ā šš“
ā ššµ ā šš¶
-
The Golden-Thompson InequalityPartial Workaround [Golden-Thompsonā65]:
šš exp š“ + šµ ā¤ šš exp š“ exp šµ
Sufficient for independent case by induction.For expander case, need this for š matrices.
šš šš“+šµ+š¶ > 0 > šš(eAeBeC)
ā šš“
ā ššµ ā šš¶
-
The Golden-Thompson InequalityPartial Workaround [Golden-Thompsonā65]:
šš exp š“ + šµ ā¤ šš exp š“ exp šµ
Sufficient for independent case by induction.For expander case, need this for š matrices. False!
šš šš“+šµ+š¶ > 0 > šš(eAeBeC)
ā šš“
ā ššµ ā šš¶
-
Key Ingredient
[Sutter-Berta-Tomamichelā16] If š“1, ā¦ , š“š are Hermitian, then
log šš šš“1+ ā¦+š“š
ā¤ ā« šš½(š) log šš šš“1(1+šš)
2 ā¦šš“š(1+šš)
2 šš“1(1+šš)
2 ā¦šš“š(1+šš)
2
ā
where š½(š) is an explicit probability density on ā.
1. Matrix on RHS is always PSD.
2. Average-case inequality: šš“š/2 are conjugated by unitaries.3. Implies Liebās concavity, triple-matrix, ALT, and more.
-
Key Ingredient
[Sutter-Berta-Tomamichelā16] If š“1, ā¦ , š“š are Hermitian, then
log šš šš“1+ ā¦+š“š
ā¤ ā« šš½(š) log šš šš“1(1+šš)
2 ā¦šš“š(1+šš)
2 šš“1(1+šš)
2 ā¦šš“š(1+šš)
2
ā
where š½(š) is an explicit probability density on ā.
1. Matrix on RHS is always PSD.
2. Average-case inequality: šš“š/2 are conjugated by unitaries.3. Implies Liebās concavity, triple-matrix, ALT, and more.
-
Proof of SBT: Lie-Trotter Formula
šš“+šµ+š¶ = limšā0+
ššš“šššµššš¶1/š
log šš šš“+šµ+š¶ = limšā0+
log ||šŗ š ||2/š
For šŗ š§ ā šš§š“
2 šš§šµ
2 šš§š¶
2
-
Proof of SBT: Lie-Trotter Formula
šš“+šµ+š¶ = limšā0+
ššš“šššµššš¶1/š
log šš šš“+šµ+š¶ = limšā0+
2log ||šŗ š ||2/š/š
For šŗ š§ ā šš§š“
2 šš§šµ
2 šš§š¶
2
-
Complex Interpolation (Stein-Hirschman)
log šŗ š 2/š
š
-
Complex Interpolation (Stein-Hirschman)
|š¹ š | = šŗ š 2/š
For each š, find analytic š¹(š§) st:
|š¹ 1 + šš” | ā¤ šŗ 1 + šš” 2
|š¹ šš” | = 1
-
Complex Interpolation (Stein-Hirschman)
|š¹ š | = šŗ š 2/š
For each š, find analytic š¹(š§) st:
|š¹ 1 + šš” | ā¤ šŗ 1 + šš” 2
|š¹ šš” | = 1
log š¹ š ā¤ ą¶±log |š¹ šš” | + ą¶±log |š¹ 1 + šš” |
-
Complex Interpolation (Stein-Hirschman)
|š¹ š | = šŗ š 2/š
For each š, find analytic š¹(š§) st:
|š¹ 1 + šš” | ā¤ šŗ 1 + šš” 2
|š¹ šš” | = 1
log š¹ š ā¤ ą¶±log |š¹ šš” | + ą¶±log |š¹ 1 + šš” |
lim š ā 0
-
Key Ingredient
[Sutter-Berta-Tomamichelā16] If š“1, ā¦ , š“š are Hermitian, then
log šš šš“1+ ā¦+š“š
ā¤ ā« šš½(š) log šš šš“1(1+šš)
2 ā¦šš“š(1+šš)
2 šš“1(1+šš)
2 ā¦šš“š(1+šš)
2
ā
where š½(š) is an explicit probability density on ā.
-
Key Ingredient
[Sutter-Berta-Tomamichelā16] If š“1, ā¦ , š“š are Hermitian, then
log šš šš“1+ ā¦+š“š
ā¤ ā« šš½(š) log šš šš“1(1+šš)
2 ā¦šš“š(1+šš)
2 šš“1(1+šš)
2 ā¦šš“š(1+šš)
2
ā
where š½(š) is an explicit probability density on ā.
Issue. SBT involves integration over unbounded region, bad for Taylor expansion.
-
Bounded Modification of SBT
Solution. Prove bounded version of SBT by replacing stripwith half-disk.
[Thm] If š“1, ā¦ , š“š are Hermitian, then
log šš šš“1+ā¦+š“š
ā¤ ā« šš½(š) log šš šš“1š
šš
2 ā¦šš“šš
šš
2 šš“1š
šš
2 ā¦šš“šš
šš
2
ā
where š½(š) is an explicit probability density on
āš
2,š
2.
š
Proof. Analytic š¹(š§) + Poisson Kernel + Riemann map.
-
Handling Two-sided Products
Issue. Two-sided rather than one-sided products:
šš šš”š š£1 š
šš
2 ā¦šš”š š£š š
šš
2 šš”š š£1 š
šš
2 ā¦šš”š š£š š
šš
2
ā
Solution.Encode as one-sided product by using šš š“ššµ = š“āšµš š£šš š :
āØšš”š š£1 š
šš
2 ā šš”š š£1
āššš
2 ā¦šš”š š£š
āššāšš
2 āšš”š š£š
āššāšš
2 vec Id , vec Id ā©
-
Handling Two-sided Products
Issue. Two-sided rather than one-sided products:
šš šš”š š£1 š
šš
2 ā¦šš”š š£š š
šš
2 šš”š š£1 š
šš
2 ā¦šš”š š£š š
šš
2
ā
Solution.Encode as one-sided product by using šš š“ššµ = š“āšµš š£šš š :
āØšš”š š£1 š
šš
2 āšš”š š£1
āšššš
2 ā¦šš”š š£š
āššāšš
2 āšš”š š£š
āššāšš
2 vec Id , vec Id ā©
-
Finishing the Proof
Carry out a version of Healyās argument with š ā š¼š2 and:
And š£šš(š¼š) ā š¢ instead of š¢.
This leads to the additional š factor.
šš
šš
šš
šš
š š0, š1
š š1, š2
š š2, š3
šø =šš”š 1 ššš
2 āšš”š 1 āšššš
2
ā¦
šš”š(š)ššš
2 āšš”š š āššāšš
2
-
Main Theorem
Thm. Suppose šŗ = (š, šø) is a regular graph with transition matrix š which has second eigenvalue š. Let š: š ā āšĆš be a function with | š š£ | ā¤ 1 and Ļš£ š š£ = 0.Then, if š£1, ā¦ , š£š is a stationary random walk:
ā1
š
š
š(š£š) ā„ š ā¤ 2šexp(āš(1 ā š)šš2)
-
Open Questions
Other matrix concentration inequalities (multiplicative, low-rank, moments)
Other Banach spaces(Schatten norms)
More applications of complex interpolation