narfinger.github.io · 1 revisitinglowerboundsformulti-r-icdepth 2 fourcircuits 3 suryajith...

Revisiting Lower Bounds for Multi-r-ic Depth1

Four Circuits2

Suryajith Chillara3

IIT Bombay, India4

[email protected]

Christian Engels6

IIT Bombay, India7

[email protected]

Abstract9

Multi-r-ic arithmetic circuits are arithmetic circuits where the individual degree of every variable, in10

polynomials computed at every node, is restricted to be at most r. This is a natural generalization11

of the multilinear restriction.12

Raz and Yehudayoff (CC, 2008) introduced a "full rank polynomial" and proved strong lower13

bounds against the multilinear circuits (of small depth) and formulas computing it, using the method14

of partial derivatives (cf. Nisan-Wigderson (CC, 1997), Raz (ToC 2006)). It is a natural question to15

ask if these techniques and polynomial constructions be extended beyond the multilinear setting, to16

multi-r-ic circuit models and thus prove lower bounds against multi-r-ic circuits and in particular,17

depth four multi-r-ic circuits.18

In this paper, we prove a superpolynomial yet quasipolynomial lower bound on the size of depth19

four multi-r-ic circuits computing an explicit "full rank polynomial" by just using the method of20

partial derivatives.21

Recently Kayal, Saha and Tavenas (ToC 2018) proved an exponential lower bound on the size of22

depth four multi-r-ic circuits computing explicit polynomials using a stronger complexity measure.23

Our proof however borrows no elements from theirs and is for a completely different polynomial.24

The proof strategy is inspired by Saptharishi’s proof of an exponential lower bound for depth three25

multi-r-ic circuits (cf. Chapter 14, Ramprasad’s survey, 2019).26

The point of this paper is to retrospectively extend our understanding of the power of the method27

of Partial Derivatives and thus achieve superpolynomial lower bounds, and its limits to realize the28

need to use stronger complexity measures to prove exponential bounds.29

2012 ACM Subject Classification Theory of Computation → Algebraic Complexity Theory30

Keywords and phrases Algebraic Complexity, Lower Bounds31

Digital Object Identifier 10.4230/LIPIcs...32

1 Introduction33

One of the major focal points in the area of algebraic complexity theory is to show that34

certain polynomials are hard to compute syntactically. Here, the hardness of computation is35

quantified by the number of arithmetic operations that are needed to compute the target36

polynomial. Instead of the standard turing machine model, we consider arithmetic circuits37

and formulas as models of computation.38

Arithmetic circuits are directed acyclic graphs such that the leaf nodes are labeled by39

variables or constants from the underlying field, and every non-leaf node is labeled either by40

a + or ×. Every node computes a polynomial by operating on its inputs with the operand41

given by its label. The flow of computation flows from the leaf to the output node. We refer42

the readers to the standard resources [27, 26] for more information on arithmetic formulas43

and arithmetic circuits.44

© Suryajith Chillara and Christian Engels;licensed under Creative Commons License CC-BY

Leibniz International Proceedings in InformaticsSchloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany

mailto:[email protected]

mailto:[email protected]

https://doi.org/10.4230/LIPIcs...

https://creativecommons.org/licenses/by/3.0/

https://www.dagstuhl.de/lipics/

https://www.dagstuhl.de

XX:2 Revisiting Lower Bounds for Multi-r-ic Depth Four Circuits

Valiant conjectured that the Permanent does not have polynomial sized arithmetic45

circuits [30]. Working towards that conjecture, we aim to prove superpolynomial circuit46

size lower bounds. However, the best known circuit size lower bound is Ω(n logn), for a47

power symmetric polynomial, due to Baur and Strassen [28, 3], and, the best known formula48

size lower bound is Ω(n2), due to Kalorkoti [15]. Due to the slow progress towards proving49

general circuit/formula lower bounds, it is natural to study some restricted class of arithmetic50

circuits and formulas.51

Since most of the polynomials of interest such as Determinant, Permanent, etc., are52

multilinear polynomials, it is natural to consider the restriction where every intermediate53

computation is in fact multilinear. Due to the phenomenal work in the last two decades [20,54

22, 21, 24, 25, 13, 23, 2, 7, 5, 6], the complexity of multilinear formulas and circuits is better55

understood than that of general formulas and circuits.56

Backed with this progress it is natural to try to extend these results to a circuit model57

where the individual degree of every variable in the polynomial computed at every node in58

the circuit is r. We refer to these circuits as multi-r-ic circuits. When r = 1, the circuit59

model is multilinear.60

Recently, Kumar, Oliviera and Saptharishi [18] showed that there is a chasm1 for61

multi-r-ic circuits too. They proved that any polynomial sized (say nc) multi-r-ic circuit of62

arbitrary depth computing a polynomial on n variables can be depth reduced to syntactical63

multi-r-ic depth four circuits of size exp(O(√n logn)). This provides us a motivation to64

study multi-r-ic depth four circuits and prove strong lower bounds against them. Towards65

this, Kayal, Saha and Tavenas [17] proved an exponential size lower bound against multi-r-ic66

depth four circuits computing a variant of the iterated matrix multiplication polynomial.67

They achieved this bound using a variant of the method Shifted Partial Derivatives [11]68

and the method of Skew Partial Derivatives [16] called the method of Shifted Skew Partial69

Derivatives.70

Motivation for this work and our results:71

For non-homogeneous multilinear circuit lower bounds (of small depth), Raz and Yehuday-72

off [24] introduced a "full rank polynomial" and proved lower bounds against the multilinear73

circuits of small depth and multilinear formulas computing it through the method of partial74

derivatives. It is a natural question to ask if such techniques can be extended and thus used75

against multi-r-ic circuits and in particular, depth four multi-r-ic circuits.76

We tread this line of thought and show that we can indeed prove super polynomial size77

lower bounds against depth four multi-r-ic circuits computing a "full rank polynomial" by78

just using the method of partial derivatives.79

The premise of our work is set in a hypothetical universe that is blissfully unaware of80

the result of Kayal, Saha and Tavenas [17] and then work towards proving lower bounds81

for multi-r-ic circuits. For this reason, our proof borrows no elements from [17] and is only82

based on tools and techniques developed prior to it.83

We first construct a "full rank polynomial" (based on the construction of Saptharishi [26,84

Chapter 14]) and then show that any depth four multi-r-ic circuit computing it must have a85

super polynomial size using the dimension of partial derivatives as a complexity measure.86

This can be formally stated as follows.87

1 Agrawal and Vinay [1], Koiran, and Tavenas [29] showed that any general circuit can be depth reducedto a depth four circuit of non-trivial size.

S. Chillara and C. Engels XX:3

I Theorem 1. Let n, r be positive integers such that r = o(√

logn). There exists an explicit88

polynomial on n-variables and degree O(n) such that any syntactically multi-r-ic depth four89

circuit computing it must be of size nΩ( lognr2 ).90

Comparison to related work:91

It is important to note that our lower bound is quasipolynomial in the number of variables92

whereas the bound of Kayal, Saha and Tavenas [17] is exponential. Our result deviates from93

that of [17] in the following ways:94

1. Our proof technique is entirely different from that of [17] and is of independent interest95

even in the presence of their quantitatively better bound.96

2. We prove a super polynomial size lower bound against depth four multi-r-ic circuits for a97

new multi-r-ic full rank polynomial.98

3. The polynomials considered by Kayal et al. have an implicit coding theoretic structure99

on a large set of their monomials making them amenable to the method of Shifted Skew100

Partial Derivatives. This structure is not so evident to us, in the full rank polynomial101

that we consider and we do not know if we can prove exponential bounds for the full rank102

polynomial using Shifted Skew Partial Derivatives.103

4. The polynomials considered by Kayal et al. are not "full rank polynomials" and just the104

method of partial derivatives for those polynomials does not yield much in this multi-r-ic105

setting when r is not too small.106

Proof overview:107

Analogous to the work of Fournier et al. [9] and [19], we first consider multi-r-ic depth four108

circuits of low bottom support2 and prove lower bounds against them. Let T1, T2, . . . , Ts be109

the terms corresponding to the product gates feeding into the output sum gate. The output110

polynomial is obtained by adding the terms T1, T2, . . . , Ts. Recall that each of these Ti’s is a111

product of low support polynomials Qi,j , that is, every monomial in these Qi,j ’s is supported112

on a small set of variables (say µ many).113

Let ρ : X 7→ Y t Z be a partition of variables X into two disjoint sets Y and Z such114

that each variable in X is either mapped to Y or Z with equal probability. Under such a115

partition, we can construct a matrix M(f |ρ) such that the rows are indexed by monomials116

in Y -variables and the columns are indexed by monomials in Z-variables. The entries of the117

matrix are defined by the coefficient of the monomial obtained by the product of its indices,118

in f |ρ (cf. Section 2).119

As in [22, 24], we show that under a random partition ρ, for all i, M(Ti|ρ) has far from120

full rank with a high probability.121

Let us first consider the following two extremal cases:122

1. Each of the Qi,j ’s is dependent on a small set of variables: Recall that rank of a matrix123

is sub-multiplicative (cf. Observation 2 Item 1) and thus we get that rank(M(Ti|ρ)) ≤124 ∏j rank(M(Qi,j |ρ)). We first observe that a factor Qi,j does not contribute non-trivially125

to the rank if all its variables are either set only to Y -variables or only to Z-variables.126

In such a case rank(M(Qi,j |ρ)) = 1. It is easy to show that this event happens with127

a decent probability. We call such a factor ineffective. In particular, when a factor128

becomes ineffective, all the variable appearances in that factor do not contribute to129

2 That is, all the product gates at the bottom are supported on small set of variables.


the rank of M(Ti|ρ). We show that under a random partition, a good fraction of the130

variable appearances3 are rendered ineffective with high probability. Conditioned on this131

probability, we then measure the contribution of the remaining variable appearances132

to the matrix rank by estimating the number of possibly non-zero rows in the matrix133

M(Ti|ρ). We show that this is smaller than maximum possible rank. The proof of this134

case borrows ideas from Saptharishi’s proof [26, Chapter 14] of an exponential lower135

bound against depth three multi-r-ic circuits.136

2. Each of the Qi,j ’s is dependent on a large set of variables: Since each of the Qi,j ’s137

is dependent on a large set of variables (say τ many) and total number of variable138

appearances is bounded by nr, we can infer that there cannot be more than nr/τ many139

factors. Since Qi,j ’s are polynomials with bottom support bounded by at most ≤ µ,140

deg(Qi,j) is at most µr and thus the degree of Ti is at most µnr2/τ . For a carefully chosen141

parameter τ , this can be small. From this, it is now clear that the possibly non-zero142

rows of the matrix M(Ti|ρ) will be indexed by monomials of degree at most µnr2/τ in Y143

variables.144

An arbitrary term Ti under consideration could have both types of factors, that is factors145

with low variable support and factors with large variable support. To deal with this, we can146

break our term Ti into two terms Ti,1 and Ti,2 such that T1 = Ti,1 ·Ti,2. Here, Ti,1 has factors147

of small support and Ti,2 has factors of large support. We then obtain rank of M(Ti,1|ρ) and148

M(Ti,2|ρ) separately using the aforementioned arguments and then obtain an upper bound149

rank of M(Ti|ρ) by multiplying ranks of M(Ti,1|ρ) and M(Ti,2|ρ) (Item 1 of Observation 2).150

Once we obtain the rank ofM(Ti|ρ), using sub-additivity of rank (Item 2 of Observation 2)151

and an union bound, we then show that the polynomial computed by a "small sized" depth four152

multi-r-ic circuit of low bottom support, has less than full rank for a random partition, with153

a non-zero probability. On the other hand, we show that an explicit candidate polynomial Fn154

has full rank under every partition of the variables. Hence, a depth four multi-r-ic circuit of155

low bottom support, of small size, cannot compute Fn. This yields us a lower bound against156

multi-r-ic depth four circuits of low bottom support.157

Using standard techniques (cf. [26, Chapter 20]), we can then lift our lower bound against158

depth four multi-r-ic circuits of low bottom support to general depth four multi-r-ic circuits.159

2 Preliminaries160

Depth four circuits:161

A depth four circuit (denoted by ΣΠΣΠ) over a field F and variables x1, x2, . . . , xn computes162

polynomials which can be expressed in the form of sums of products of polynomials. That163

is,s∑i=1

di∏j=1

Qi,j(x1, . . . , xn) for some di’s. A depth four circuit of bottom support t (denoted164

by ΣΠΣΠt) is a depth four circuit where all the monomials in every polynomial Qi,j are165

supported on at most t variables.166

Multi-r-ic arithmetic circuits:167

We refer the readers to the standard resources [27, 26] for the definitions of arithmetic168

formulas and arithmetic circuits.169

3 Recall that any variable can appear at most r times among all the factors.


I Definition 2 (multi-r-ic circuits). Let r = (r1, r2, · · · , rn).170

1. A polynomial f(x1, x2, · · · , xn) is said to be multi-r-ic if for all i ∈ [n], the individual171

degree of the variable xi in the polynomial computed at every node is at most ri. If172

r1 = r2 = · · · = rn = r, we simply refer to it as a multi-r-ic polynomial.173

2. An arithmetic circuit Φ is said to be a multi-r-ic circuit if the polynomial fu computed a174

node u ∈ Φ is a multi-r-ic polynomial for all u ∈ Φ.175

3. An arithmetic circuit Φ is said to be a syntactically multi-r-ic circuit if for all product176

gates u ∈ Φ and u = u1 × u2 × · · · × ut, the total degree with respect to every variable is177

bounded by r, i.e.,∑j∈[t] degxi(fuj ) ≤ r for all i ∈ [n].178

Complexity measure:179

In this article, we use the measure of the dimension of partial derivative space as the180

complexity measure to prove lower bounds (cf. [20, 22]). We follow the coefficient matrix181

perspective of Raz.182

Let ρ : X 7→ Y t Z be a partitioning function of X variables. Any monomial m in the183

polynomial f over the variables X, upon the application of the partitioning function can be184

expressed as a product of a monomial in Y variables with a monomial in Z variables.185

f(X) =∑i∈|M|

ci ·mi(X) 7→ f |ρ(Y, Z) =∑i∈|M|

ci · mi(Y ) · mi(Z)186

187

Let M |ρ(f |ρ) be a matrix whose rows are indexed by all the monomials in Y variables and188

the columns are indexed by all the monomials in Z variables. The entry indexed by the189

monomials (m(Y ), m(Z)) in M |ρ(f |ρ) is the coefficient of the monomial m(Y ) · m(Z) in f |ρ.190

The complexity of f under ρ is the rank of the matrix Mρ(f |ρ). In short, we shall denote it191

as Γρ(f |ρ).192

I Observation 1. For any multi-r-ic polynomial f(X) and a partitioning function ρ : X 7→193

Y t Z, Γρ(f |ρ) is at most (r + 1)min|Y |,|Z|.194

I Observation 2. Let f1 and f2 be two polynomials defined over the sets of variables X. Let195

ρ : X 7→ Y t Z be a partitioning function.196

1. Sub-multiplicativity: If f(X) = f1(X) · f2(X) then Γρ(f |ρ) ≤ Γρ(f |ρ) · Γρ(f |ρ).197

2. Sub-additivity: If f(X) = f1(X) + f2(X) then Γρ(f |ρ) ≤ Γρ(f |ρ) + Γρ(f |ρ).198

I Observation 3. Let f1 and f2 be two polynomials defined over the disjoint sets of variables199

X1 and X2. Let ρ1 : X1 7→ Y1 tZ1 and ρ2 : X2 7→ Y2 tZ2 be two partitioning functions such200

that Y1, Y2, Z1, Z2 are all distinct. Let f(X1 ∪X2) = f1(X1) · f2(X2) and ρ = ρ1 ⊕ ρ2. Then201

Mρ(f |ρ) = Mρ1(f1|ρ1)⊗Mρ2(f2|ρ2) and thus, Γρ(f |ρ) = Γρ1(f |ρ1) · Γρ2(f |ρ2).202

Useful probabilistic estimates:203

The following version of the Chernoff bound [4, 12] is from the book of Dubashi and204

Panconesi [8, Theorem 1.1].205

I Theorem 3 (Chernoff bound). Let W1, . . . ,Wn be independent 0, 1-valued random vari-206

ables and let W =∑i∈[n]Wi. Then we have the following.207

1. For any ε > 0,208

Pr [W > (1 + ε)E [W ]] ≤ exp(−ε2

3 E [W ]) and Pr [W < (1− ε)E [W ]] ≤ exp(−ε2

2 E [W ]).209


2. For any t > 2eE [W ],210

Pr [W > t] ≤ 2−t.211

We also need the following "read-r" version of concentration bounds.212

I Theorem 4 ([10, 14]). Let X1, . . . , Xn be independent random variables. Let E1, . . . , Ek213

be boolean random variables that are functions of Xi such that each Xi influences at most214

r of the Ej’s. If Pr[Ei = 1] ≥ p, then for any ε > 0, we have215

Pr [E1 + · · ·+ Ek ≤ (p− ε)k] ≤ e−2ε2k/r.216

3 Low-support multi-r-ic depth 4 circuits217

In this section we will show that any polynomial that is computed by a syntactically multi-r-ic218

depth four circuit of low bottom support must have strictly less than full rank for at least219

one partition of the variables.220

I Theorem 5. Let n, r > 1, µ be positive integers. Let T be a syntactic multi-r-ic product of221

polynomials Q1Q2 · · ·Qt over the variables X = x1, x2, . . . , xn. Let ρ be a random partition222

of variables X into Y t Z with equal probability. Let m = min |Y | , |Z|. Then, Γρ(T |ρ) is223

at most (1 + r)m−n

240µr2+2 with a probability of at least 1− 2 exp(

n280µr2+1

).224

Proof. Since, ρ is a random partition of variables X into Y t Z with equal probability,225

Eρ [|Y |] = |X| /2 = n/2. Let nY denote |Y |. Without loss of generality, let us assume that Y226

is the smaller partition4. Using Chernoff bound (from Theorem 3), we get that nY ≥ n/4227

with a probability of at least 1− exp(− n

16). From here on, we shall condition the rest of the228

analysis on nY ≥ n/4 and use this error probability to union bound all the error probabilities229

at the end.230

Let us further express T as a product of T1 and T2 where231

T1 = Qi | |Supp(Qi)| < τ and T2 = Qi | |Supp(Qi)| ≥ τ232

for a parameter τ fixed5 to 10µnr2

nY. Let a = (a1, a2, . . . , an) be a n-length vector that233

represents the maximum degree of all the variables in T1 and b = (b1, b2, . . . , bn) be a234

n-length vector that represents the maximum degree of all the variables in T2. Since the235

circuit C is multi-r-ic , we get that a + b ≤ r. Further, for all i ∈ [n], ai + bi ≤ r.236

From sub-multiplicativity of rank, we get that Γρ(T |ρ) ≤ Γρ(T1|ρ) · Γρ(T2|ρ). We shall237

now compute Γρ(T1|ρ) and Γρ(T2|ρ) independently and thus obtain a suitable upper bound238

on Γρ(T |ρ).239

Computing Γρ(T1|ρ):240

From the definition, it is clear that the support of all the polynomials in T1 is at most τ .241

With some reordering of indexes, we can consider T1 to be Q1Q2 · · ·Qt1 . For all i ∈ [t1], let242

the polynomial Qi be defined over a variable sets Xi. Let w1, w2, . . . , w` be an enumeration243

of all the appearances of the X variables in T1 with repetition.244

4 If not, interchange the labels of the sets Y and Z.5 Note that technically, τ is a random variable as nY is a random variable.


We call a factor Qi (i ∈ [t1]) ineffective under ρ if all the variables in the support of Qi245

are either all mapped to Y variables or to Z variables. We call a variable wj ineffective under246

ρ if it appears in a factor and if it occurs in an ineffective factor. Let Ej be a 0, 1-random247

variable such that Ej = 1 if wj is ineffective and 0 otherwise.248

Prρ [Ej = 1] = Prρ [Qj is ineffective under ρ] = 12τ−1 .249

Notice, however, that Ei, Ej are not independent.250

Let us now count the total number of ineffective appearances of Y variables. Note that251

Ej ’s are at most r-wise dependent. This is due to the fact that any variable xi appears in at252

most r many Qj ’s and thus under ρ, it can influence at most r-many Ej ’s. Let S be a subset253

of [`] such that j ∈ S iff the X variable corresponding to wj maps to a Y variable under ρ.254

Let rS =∑i:ρ(xi)∈Y ai. Using Theorem 4 with the value of ε set to 2−τ , we get that255

Prρ

∑j∈S

Ej ≤rS2τ

≤ exp(− rSr · 22τ−1

).256

With probability of(1− exp

(− rSr·22τ−1

)), at most rS

(1− 1

2τ)many variable appear-257

ances are not ineffective. Let d be a vector of length n that represents the effective258

variable appearances. From the aforementioned arguments, we get that with high probability259 ∑i:ρ(xi)∈Y di ≤ rS

(1− 1

2τ). The number of possible non-zero rows in MY,Z(T1) is at most260 ∏

i:ρ(xi)∈Y (1 + di). Remember that nY = |Y |.261

Γρ(T1|ρ) ≤∏

i:ρ(xi)∈Y

(1 + di)262

≤

(nY +

∑i:ρ(xi)∈Y di

nY

)nY(Using AM-GM inequality)263

≤

(1 +

rS(1− 1

2τ)

nY

)nY264

≤ (1 + r)rSr (1− 1

2τ ) . (Using (1 + ux)n ≤ (1 + x)un)265266

Rephrasing this, we can say that Γρ(T1|ρ) is at most (1 + r)rSr (1−ν) where ν is at least267

1/2τ with a probability of(1− exp

(− rSr·22τ−1

)).268

Computing Γρ(T2|ρ):269

Under suitable renaming of polynomials, let T2 be the product R1R2 · · ·Rt2 . Let dL be equal270

to∑i∈[n] bi, rL be equal to

∑i:ρ(xi)∈Y bi. Note that dL ≤ nr.271

Trivially, the number of possible non-zero rows inMY,Z(T2|ρ) is at most∏i:ρ(xi)∈Y (1+bi).272

Γρ(T2|ρ) ≤∏

i:ρ(xi)∈Y

(1 + bi) ≤(nY +

∑i:ρ(xi)∈Y bi

nY

)nY≤ (1 + r)

rLr .273

274

We shall now show that this bound can be improved if there is a significant chunk of Y275

variables that appear in T2. From the definition, we know that for all i ∈ [t2], the polynomial276


Ri has variable support of at least τ and every monomial depends only on µ variables6. This277

directly leads us to the following observation.278

I Observation 4. For all i ∈ [t2], degree of the polynomial Ri is at most µr. Thus, the279

degree of T2 is at most µrt2.280

Note that dL counts the aggregate of the individual degrees of all the variables in T2 and the281

support size of each Ri is at least τ . From this, we can infer that there cannot be too many282

factors.283

I Observation 5. The total number of factors in T2 is at most dL/τ ≤ nr/τ . Thus, degree284

of T2 is at most µnr2

τ .285

Thus, the number of potential non-zero rows in MY,Z(T2|ρ) will be indexed by all286

monomials in Y of degree at most µrt2 ≤ µnr2/τ . since τ = 10µnr2

nY, we have that nY ≥287

µnr2

τ = nY10 .288

Γρ(T2|ρ) ≤(nY + µnr2

τ

nY

)289

=(nY + µnr2

τµnr2

τ

)290

≤

(nY + µnr2

τµnr2

τ

)µnr2τ

· eµnr2τ (Using

(n

k

)≤ (en/k)k for k ≤ n/2)291

=(

1 + nY τ

µnr2

)µnr2τ

· eµnr2τ292

≤ (1 + r)nYr + µnr2

τ ln(1+r) . (Using (1 + ux)n ≤ (1 + x)un)293294

We will now bound Γρ(T2|ρ) by (1 + r)rLr (1−η) where η = max

0, 1− r

rL

(nYr + µnr2

τ ln(1+r)

).295

Computing Γρ(T |ρ):296

From the sub-multiplicativity of the measure, we know that Γρ(T |ρ) is at most Γρ(T1|ρ) ·297

Γρ(T2|ρ).298

Γρ(T |ρ) ≤ (1 + r)rSr (1−ν) · (1 + r)

rLr (1−η)

299

= (1 + r)(rL+rS

r − rLηr −rSν

r )300

≤ (1 + r)(nY + rL+rSr −(nY + rLη

r + rSν

r ))301

= (1 + r)nY −∆302303

where ∆ =(nY + rLη

r + rSνr

)− rL+rS

r . Note that the maximum value Γρ(T |ρ) can attain is304

(1 + r)nY and ∆ is a measure of rank deficiency under ρ.305

Recall that rL + rS ≤ nY r and hence ∆ ≥ 0. Let rL ≤ δnY r and rS = (1 − δ)nY r306

for some δ = δ(τ) ∈ [0, 1]. Note that the value of δ gets fixed given a summand T and a307

threshold parameter τ .308

6 Recall that each of the polynomials R1, R2, . . . , Rt2 are such that every monomial has a support of atmost µ.


Case when δ ≤ 0.75: Since δ is small, the contribution from T2 towards ∆ might be small309

or nothing at all. In this case, we shall argue that ∆ can be suitably lower bounded by just310

using the contributions from T1 with high probability.311

∆ =(nY + rLη

r+ rSν

r

)− rL + rS

r312

≥ nY −rLr− rS

r(1− ν) (Since η ≥ 0)313

≥ nY − δnY − (1− δ)nY (1− ν) (Since rL ≤ δnY r and rS = (1− δ)nY r)314

= nY (1− δ)ν.315316

With a probability of at least(1− exp

(− rSr·22τ−1

))≥(1− exp

(− nY

22τ+1

)), ν is at least317

12τ . Conditioned on this probability, we get that ∆ ≥ nY

2τ+2 .318

Case when δ > 0.75: Since δ is not too small, the contribution from T2 towards ∆ will319

start getting significantly larger. In this case, we shall argue that ∆ can be suitably lower320

bounded by just using the contributions from T2. By ignoring the contributions from T1 to321

∆, we can get rid of probabilistic arguments in this case.322

Firstly, we will argue that if rL+rSr is much smaller than nY , then ∆ is non-trivially lower323

bounded. Let us suppose that rL+rSr ≤ (1− ε)nY for a constant ε fixed to a value of 0.1. In324

such a case,325

∆ =(nY + rLη

r+ rSν

r

)− rL + rS

r≥(nY −

rL + rSr

)≥ εnY = nY

10 .326

327

Hereafter, we can assume that rL+rSr > (1− ε)nY and thus, rL ≥ (δ − ε)nY r.328

∆ =(nY + rLη

r+ rSν

r

)− rL + rS

r329

≥(nY −

rS + rLr

)+ rL

r− nY

r− µnr2

τ ln(1 + r)330

≥ (δ − ε)nY −nYr− µnr2

τ ln(1 + r) (Since nY ≥rS + rL

r)331

≥(δ − ε− 1

r

)nY −

nY10 ln(1 + r) (Since τ = 10µnr2/nY )332

≥(δ − ε− 1

r− 1

10 ln(1 + r)

)nY333

≥ (δ − 0.7)nY (Since ∀ r > 1, 1r

+ 110 ln(1 + r) < 0.6)334

≥ 0.05nY .335336

Putting it all together:337

From the above mentioned case analysis, for large values of τ we can infer that ∆ ≥338

min

0.1nY , 0.05nY , nY2τ+2

≥ nY

2τ+2 with a probability of at least 1 − exp(− nY

22τ+1

). Recall339

that nY ≥ n4 with a probability of 1− exp

(− n

16). Conditioned on this, τ is at most 40µr2.340

Thus, ∆ ≥ n240µr2+2 with a probability of at least 1 −

(exp

(− n

16)

+ exp(− n

280µr2+1

))≥341 (

1− 2 exp(− n

280µr2+1

)).342


This gives us the theorem statement that Γρ(T |ρ) is at most (1 + r)min|Y |,|Z|− m

240µr2+2343

with a probability of at least 1− 2 exp(− n

280µr2+1

). J344

I Theorem 6. Let n, r, µ be positive integers. Let C be a syntactically multi-r-ic depth four345

ΣΠΣΠµ circuit of size s < exp(

n280µr2+2

)that computes a multi-r-ic polynomial f over346

the variables X = x1, x2, · · · , xn. Then there exists a random partition ρ of variables X347

into Y t Z with equal probability such that Γρ(f |ρ) is strictly less than (1 + r)min|Y |,|Z|.348

Proof. Recall that f is a sum of at most s many products of polynomials T (1) + · · · +349

T (s). Using sub-additivity of measure, we get that Γρ(f |ρ) ≤ s · maxi∈[s] Γρ(T (i)|ρ) . Let350

m = min |Y | , |Z|. From Theorem 5, we get that for a term T (i), Γρ(T (i)|ρ) is at most351

(1 + r)m−n

240µr2+2 with a probability of at least 1− 2 exp(− n

280µr2+1

). Taking a union bound352

over all the T (i)’s, we get that for all i ∈ [s], Γρ(T (i)|ρ) is is at most (1 + r)m−n

240µr2+2 with353

a probability of at least p = 1− 2s exp(− n

280µr2+1

). Since s < exp

(n

280µr2+2

), p is strictly354

greater than zero.355

Thus with a non-zero probability, we get that Γρ(f |ρ) is strictly less than (1 + r)m.356

Γρ(f |ρ) ≤ s ·maxi∈[s]

Γρ(T (i)|ρ)

357

≤ s · (1 + r)m−n

240µr2+2358

< (1 + r)m− n

240µr2+2+ n

280µr2+2 log(1+r)359

< (1 + r)m .360361

J362

4 Multi-r-ic depth 4 circuits363

Before we describe the proof of the lower bound, we shall introduce our polynomial family. To364

begin with, we consider a polynomial which is a slight generalization of a full rank polynomial365

of Raz and Yehudayoff [24]7. Using this polynomial, we construct the hard polynomial F by366

composing the full rank polynomial with linear forms akin to [26].367

The full rank polynomial Fn is defined over the field368

F(wa,b,c, wa,b : a, b, c ≤ 2n)[x1, x2, . . . , xn]. In short, we denote it as F(W )[X].369

Notice, that wa,b,c and wa,b are some auxiliary variables here. The full rank polynomial370

Fn = f1,2n is defined recursively as follows.371

fi,j =

r∑k=0

xki xki+1 if j = i+ 1,(

r∑k=0

xki xkj

)· fi+1,j−1 · wi,j

+j−1∑`=i+1

fi,` · f`+1,j · wi,`,j

if j − i is odd,

0 otherwise.

372

7 This was first defined and was then used by Saptharishi to prove a multi-r-ic depth three lower bound [26]


We can now define our hard polynomial Pn based on the definition of Fn. Let p = n−c373

for some constant c ∈ (0, 1) which we shall later fix. Let X = x1, . . . , x2n and X =374

x1,1, x1,2, . . . , x1,t, . . . , x2n,1, x2n,2, . . . , x2n,t be distinct variable sets, where t = 2n log (2n)p .375

Let Linp : X 7→ X be a linear map such that xi 7→∑tj=1 xi,j for all i ∈ [2n]. The polynomial376

Pn,p(X) be defined to be Fn Linp(X). That is,377

Pn,p = Fn(t∑

j=1x1,j ,

t∑j=1

x2,j , · · · ,t∑

j=1x2n,j) where t = 2n log (2n)

p.378

379

We will now recall the following useful trick due to Kumar and Saptharishi [26].380

I Lemma 7 (Lemma 20.5 [26]). Let σ be a random restriction on the variable set X that sets381

a variable to zero with a probability of (1− p) where p = n−c for some constant c ∈ (0, 1) .382

Then Fn is a projection of (Pn,p)|σ with a probability of at least (1− 2−n).383

We will now show that the matrix corresponding to the polynomial Fn under every384

partition that we consider is of full rank and hence it is a hard polynomial for the circuit385

model under consideration.386

I Lemma 8. The polynomial Fn = f1,2n defined above has the property that for every387

partition X = x1, . . . , x2n = Y t Z, we have388

Γρ(Fn|ρ) ≥ (r + 1)min(|Y |,|Z|).389

Further, this polynomial can be computed by a linear sized arithmetic circuit and hence is in390

VP.391

Proof. The proof is similar to Lemma 13.17 [26]. We use an induction over n, where 2n is392

the number of variables. For n = 1 we have the polynomial F1 =∑rk=0 x

k1x

k2 . It is clear that393

any partition of x1 and x2 will have rank of (r + 1)min(|Y |,|Z|). That is, Γρ(F1) = 1 if both394

the variables are on the same side, and is equal to r + 1 if there is an equi-partition.395

Let n > 1 and let ρ : X 7→ Y t Z be any partition. Under the partition ρ, the variables396

x1, x2n could either be to the same side or get mapped to different sets, amongst Y and Z.397

We handle these cases separately.398

Let Y = Y \ ρ(x1), ρ(x2n) and Z = Z \ ρ(x1), ρ(x2n).399

Case 1: x1, x2n are in different parts.400

For ease of argument we set w1,i,2n = 0 for all i and let w1,2n = 1. Then we see that our401

polynomial is given by∑kr=0 (xr1xr2n) f2,2n−1wi,j . By invoking the induction hypothesis,402

we get that f2,2n−1 has a rank of at least (r + 1)min (|Y |,|Z|). Since∑kr=0 (xr1xr2n) and403

f2,2n−1 are defined over disjoint sets of variables, Observation 3 leads us to the following.404

Γρ(k∑r=0

(xr1xr2n) f2,2n−1) = Γρ(k∑r=0

(xr1xr2n)) · Γρ(f2,2n−1)405

≥ (r + 1) · (r + 1)min (|Y |,|Z|)406

= (r + 1)min(|Y |,|Z|).407408

Case 2: x1, x2n are in the same part. We know that there exists some k such that [1, k]409

and [k + 1, 2n] have equal size. We set w1,2n to zero. We can use our induction410

hypothesis on f1,k and fk+1,2n. Let us define Y1,k = ρ(x1), ρ(x2), . . . , ρ(xk) ∩ Y411

and Z1,k = ρ(x1), ρ(x2), . . . , ρ(xk) ∩ Z. Similarly, we can define Yk+1,2n =412


ρ(xk+1), ρ(xk+2), . . . , ρ(x2n) ∩ Y and Zk+1,2n = ρ(xk+1), ρ(xk+2), . . . , ρ(x2n) ∩ Z.413

By invoking the induction hypothesis, we get that Γρ(f1,k) ≥ (r + 1)min|Y1,k|,|Z1,k| and414

Γρ(fk+1,2n) ≥ (r + 1)min|Yk+1,2n|,|Zk+1,2n|. From the definition of the polynomial, f1,k415

and fk+1,2n are defined over disjoint sets of variables and thus using Observation 3 we416

get that417

Γρ(f1,2n) ≥ (r + 1)min|Y1,k|,|Z1,k| · (r + 1)min|Yk+1,2n|,|Zk+1,2n| ≥ (r + 1)min(|Y |,|Z|).418419

The last inequality follows from the fact that mina1, b1+mina2, b2 ≥ mina1+a2, b1+b2.420

It is easy to construct a polynomial sized circuit for Fn bottom-up by following the recursive421

definition and infer that it is in fact in VP. J422

For the moment, let us assume that the polynomial Fn/2 is being computed by a423

multi-r-ic depth four ΣΠΣΠµ circuit of size s < exp(

n280µr2+2

). Theorem 8 tells us that424

for all partitions ρ : X 7→ Y t Z that we consider, Γρ(Fn/2|ρ) is at least (1 + r)min|Y |,|Z|.425

On the other hand, we infer from Theorem 6 there is at least one partition ρ such that426

Γρ(Fn/2|ρ) is strictly less than (1 + r)min|Y |,|Z| and thus arriving at a contradiction to our427

assumption. This can formally be summarized as follows.428

I Theorem 9. Let n, r, µ be positive integers. Any syntactically multi-r-ic depth four429

ΣΠΣΠµ circuit that computes Fn (x1, x2, . . . , x2n) must be of size exp(

Ω(

n280µr2

)).430

To prove the main theorem, we also need the following lemma.431

I Lemma 10 (Analogous to Lemma 20.4 [26]). Let P be a multi-r-ic polynomial that is432

computed by a syntactically multi-r-ic depth 4 circuit C of size s ≤ nδµ for some δ > 0 .433

Let σ be a random restriction that sets each variable to zero independently with probability434

(1− n−2δ) . Then with probability at least (1− 1/s) the polynomial σ(P ) is computed by a435

multi-r-ic depth four circuit C ′ of bottom support at most µ and size s.436

Equipped with Theorem 10 and Theorem 9, we are now ready to prove our main theorem.437

I Theorem 11. Let n, r be positive integers and δ, c be some constants in (0, 1). Let the438

parameter p be equal to n−δ. If C be any syntactically multi-r-ic depth four circuit computing439

the polynomial Pn,p(X) as described above then it must be of size nΩ( lognr2 ).440

Proof. Let µ be a parameter that is set to γ lognr2 for a small constant γ. Let σ : X 7→ 0, ∗441

be a random restriction such that a variable is set to zero with a probability of (1− n−2δ),442

and is left untouched otherwise. Let C be a syntactically multi-r-ic depth four circuit of size443

s ≤ nδµ that computes Pn,p. Theorem 10 tells us that C ′ = σ(C) is a multi-r-ic depth four444

circuit of size s and bottom support at most µ. Hence, σ(Pn,p) has a multi-r-ic ΣΠΣΠµ445

size at most s. Since Fn is a p-projection of σ(Pn,p), Fn also has a multi-r-ic ΣΠΣΠµ of446

size at most s.447

On the other hand, from Theorem 9 we know that any multi-r-ic ΣΠΣΠµ circuit that448

computes Fn must be of size exp(

Ω(

n280µr2

)). Thus, it must be that exp

(Ω(

n280µr2

))≤449

s ≤ nδµ. We can choose γ to be a small enough constant such that the aforementioned450

expression is satisfied. Thus, s must at least be nΩ( lognr2 ). J451

References452

1 Manindra Agrawal and V. Vinay. Arithmetic circuits: A chasm at depth four. In proceedings453

of Foundations of Computer Science (FOCS), pages 67–75, 2008. doi:10.1109/FOCS.2008.32.454

http://dx.doi.org/10.1109/FOCS.2008.32


2 Noga Alon, Mrinal Kumar, and Ben Lee Volk. Unbalancing sets and an almost quadratic455

lower bound for syntactically multilinear arithmetic circuits. In CCC, volume 102 of LIPIcs,456

pages 11:1–11:16. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2018.457

3 Walter BAUR and Volker STRASSEN. The complexity of partial derivatives. Theoretical458

Computer Science, 22:317–330, 1983.459

4 Herman Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on460

the sum of observations. The Annals of Mathematical Statistics, 23(4):493–507, 1952. URL:461

http://www.jstor.org/stable/2236576.462

5 Suryajith Chillara, Christian Engels, Nutan Limaye, and Srikanth Srinivasan. A near-optimal463

depth-hierarchy theorem for small-depth multilinear circuits. In FOCS, pages 934–945. IEEE464

Computer Society, 2018.465

6 Suryajith Chillara, Nutan Limaye, and Srikanth Srinivasan. A quadratic size-hierarchy theorem466

for small-depth multilinear formulas. In ICALP, volume 107 of LIPIcs, pages 36:1–36:13.467

Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2018.468

7 Suryajith Chillara, Nutan Limaye, and Srikanth Srinivasan. Small-depth multilinear formula469

lower bounds for iterated matrix multiplication with applications. SIAM J. Comput., 48(1):70–470

92, 2019.471

8 Devdatt P. Dubhashi and Alessandro Panconesi. Concentration of Measure for the Analysis472

of Randomized Algorithms. Cambridge University Press, 2009. URL: http://www.cambridge.473

org/gb/knowledge/isbn/item2327542/.474

9 Hervé Fournier, Nutan Limaye, Guillaume Malod, and Srikanth Srinivasan. Lower bounds for475

depth 4 formulas computing iterated matrix multiplication. In proceedings of Symposium on476

Theory of Computing (STOC), pages 128–135, 2014. URL: http://doi.acm.org/10.1145/477

2591796.2591824.478

10 Dmitry Gavinsky, Shachar Lovett, Michael E. Saks, and Srikanth Srinivasan. A tail bound479

for read-k families of functions. Random Struct. Algorithms, 47(1):99–108, 2015. URL:480

https://doi.org/10.1002/rsa.20532, doi:10.1002/rsa.20532.481

11 Ankit Gupta, Pritish Kamath, Neeraj Kayal, and Ramprasad Saptharishi. Approaching the482

chasm at depth four. Journal of the ACM (JACM), 61(6):33, 2014.483

12 Wassily Hoeffding. Probability inequalities for sums of bounded random variables. Journal484

of the American Statistical Association, 58(301):13–30, 1963. URL: http://www.jstor.org/485

stable/2282952.486

13 Pavel Hrubeš and Amir Yehudayoff. Homogeneous formulas and symmetric polynomials.487

Computational Complexity, 20(3):559–578, 2011.488

14 Svante Janson. Poisson approximation for large deviations. Random Structures & Algorithms,489

1(2):221–229, 1990.490

15 KA Kalorkoti. A lower bound for the formula size of rational functions. SIAM Journal on491

Computing, 14(3):678–687, 1985.492

16 Neeraj Kayal, Vineet Nair, and Chandan Saha. Separation between read-once oblivious493

algebraic branching programs (ROABPs) and multilinear depth three circuits. In proceedings of494

Symposium on Theoretical Aspects of Computer Science (STACS), pages 46:1–46:15, 2016. URL:495

https://doi.org/10.4230/LIPIcs.STACS.2016.46, doi:10.4230/LIPIcs.STACS.2016.46.496

17 Neeraj Kayal, Chandan Saha, and Sébastien Tavenas. On the size of homogeneous and of depth-497

four formulas with low individual degree. Theory of Computing, 14(16):1–46, 2018. URL: http:498

//www.theoryofcomputing.org/articles/v014a016, doi:10.4086/toc.2018.v014a016.499

18 Mrinal Kumar, Rafael Mendes de Oliveira, and Ramprasad Saptharishi. Towards optimal500

depth reductions for syntactically multilinear circuits. Electronic Colloquium on Computational501

Complexity (ECCC), 26:19, 2019.502

19 Mrinal Kumar and Shubhangi Saraf. On the power of homogeneous depth 4 arithmetic circuits.503

In proceedings of Foundations of Computer Science (FOCS), 2014. doi:10.1109/FOCS.2014.504

46.505

http://www.jstor.org/stable/2236576

http://www.cambridge.org/gb/knowledge/isbn/item2327542/



http://doi.acm.org/10.1145/2591796.2591824

http://doi.acm.org/10.1145/2591796.2591824

http://doi.acm.org/10.1145/2591796.2591824

https://doi.org/10.1002/rsa.20532

http://dx.doi.org/10.1002/rsa.20532




https://doi.org/10.4230/LIPIcs.STACS.2016.46

http://dx.doi.org/10.4230/LIPIcs.STACS.2016.46

http://www.theoryofcomputing.org/articles/v014a016



http://dx.doi.org/10.4086/toc.2018.v014a016





20 Noam Nisan and Avi Wigderson. Lower bounds on arithmetic circuits via partial derivatives.506

Computational Complexity, 6(3):217–234, 1997. doi:10.1007/BF01294256.507

21 Ran Raz. Multilinear-NC2 6= multilinear-NC1. In proceedings of Foundations of Computer508

Science (FOCS), pages 344–351, 2004. URL: https://doi.org/10.1109/FOCS.2004.42, doi:509

10.1109/FOCS.2004.42.510

22 Ran Raz. Separation of multilinear circuit and formula size. Theory of Computing, 2(1):121–135,511

2006. doi:10.4086/toc.2006.v002a006.512

23 Ran Raz, Amir Shpilka, and Amir Yehudayoff. A lower bound for the size of syntactically513

multilinear arithmetic circuits. SIAM Journal of Computing, 38(4):1624–1647, 2008. doi:514

10.1137/070707932.515

24 Ran Raz and Amir Yehudayoff. Balancing syntactically multilinear arithmetic circuits. Com-516

putational Complexity, 17(4):515–535, 2008. doi:10.1007/s00037-008-0254-0.517

25 Ran Raz and Amir Yehudayoff. Lower bounds and separations for constant depth multilinear518

circuits. Computational Complexity, 18(2):171–207, 2009. doi:10.1007/s00037-009-0270-8.519

26 Ramprasad Saptharishi. A survey of lower bounds in arithmetic circuit complexity version520

8.0.4. Github survey, 2019. URL: https://github.com/dasarpmar/lowerbounds-survey/521

releases/.522

27 Amir Shpilka and Amir Yehudayoff. Arithmetic circuits: A survey of recent results and open523

questions. Foundations and Trends in Theoretical Computer Science, 5:207–388, March 2010.524

URL: http://dx.doi.org/10.1561/0400000039.525

28 Volker Strassen. Die berechnungskomplexität von elementarsymmetrischen funktionen und526

von interpolationskoeffizienten. Numerische Mathematik, 20(3):238–251, 1973.527

29 Sébastien Tavenas. Improved bounds for reduction to depth 4 and depth 3. Information and528

Computation, 240:2–11, 2015. doi:10.1016/j.ic.2014.09.004.529

30 L. G. Valiant. Completeness classes in algebra. In Proceedings of the Eleventh Annual ACM530

Symposium on Theory of Computing, STOC ’79, pages 249–261, New York, NY, USA, 1979.531

ACM. URL: http://doi.acm.org/10.1145/800135.804419, doi:10.1145/800135.804419.532

http://dx.doi.org/10.1007/BF01294256

https://doi.org/10.1109/FOCS.2004.42




http://dx.doi.org/10.4086/toc.2006.v002a006

http://dx.doi.org/10.1137/070707932

http://dx.doi.org/10.1137/070707932

http://dx.doi.org/10.1137/070707932

http://dx.doi.org/10.1007/s00037-008-0254-0

http://dx.doi.org/10.1007/s00037-009-0270-8

https://github.com/dasarpmar/lowerbounds-survey/releases/



http://dx.doi.org/10.1561/0400000039

http://dx.doi.org/10.1016/j.ic.2014.09.004

http://doi.acm.org/10.1145/800135.804419

http://dx.doi.org/10.1145/800135.804419

narfinger.github.io · 1 revisitinglowerboundsformulti-r-icdepth 2 fourcircuits 3 suryajith...

Documents