
Kolmogorov’s Superposition Theorem

Xiling Zhang

06 Oct 2016

Xiling Zhang PG Colloquium 06 Oct 2016 1 / 14


Hilbert’s 13th Problem

Algebraic equations of degree up to 6 can (under a suitable transformation) be solved by functions of two variables. What about

$$x^7 + ax^3 + bx^2 + cx + 1 = 0?$$

Hilbert’s conjecture: $x(a, b, c)$ cannot be expressed by a superposition (sums and compositions) of continuous bivariate functions.


Question: can every continuous (analytic, $C^\infty$, etc.) function of $n$ variables be represented as a superposition of continuous (analytic, $C^\infty$, etc.) functions of $n - 1$ variables?

Theorem (D. Hilbert)

There is an analytic function of three variables that cannot be expressed as a superposition of bivariate ones.

Theorem (A. Vitushkin)

For all $n/\alpha > n'/\alpha'$ with $\alpha' > 1$ and $\alpha, \alpha' \notin \mathbb{N}$, there is an $f \in C^{[\alpha],\,\alpha-[\alpha]}(\mathbb{R}^n)$ that is not a superposition of functions in $C^{[\alpha'],\,\alpha'-[\alpha']}(\mathbb{R}^{n'})$.


Theorem (A. Kolmogorov, 1956; V. Arnold, 1957)

Given $n \in \mathbb{Z}^+$, every $f_0 \in C([0,1]^n)$ can be represented as

$$f_0(x_1, x_2, \dots, x_n) = \sum_{q=1}^{2n+1} g_q\!\left(\sum_{p=1}^{n} \varphi_{pq}(x_p)\right),$$

where the $\varphi_{pq} \in C[0,1]$ are increasing functions independent of $f_0$ and the $g_q \in C[0,1]$ depend on $f_0$.

Can choose the $g_q$ to be all the same, $g_q \equiv g$ (Lorentz, 1966).

Can choose the $\varphi_{pq}$ to be Hölder or Lipschitz continuous, but not $C^1$ (Fridman, 1967).

Can choose $\varphi_{pq} = \lambda_p \varphi_q$, where $\lambda_1, \dots, \lambda_n > 0$ and $\sum_p \lambda_p = 1$ (Sprecher, 1972).
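To make the shape of the representation concrete, the following minimal sketch evaluates a sum of outer functions applied to sums of inner functions. The name `superposition` and the toy decomposition of $xy$ are illustrative assumptions. The toy inner functions are not Kolmogorov's universal ones (one of them is decreasing and only two outer terms are used); they only show how a bivariate function can arise from univariate functions and addition.

```python
def superposition(outer, inner, x):
    """Evaluate sum_q outer[q]( sum_p inner[p][q](x[p]) ).

    outer: list of univariate functions g_q
    inner: inner[p][q] is the univariate function phi_pq
    x:     the point (x_1, ..., x_n) as a sequence of numbers
    """
    total = 0.0
    for q, g in enumerate(outer):
        s = sum(inner[p][q](x[p]) for p in range(len(x)))
        total += g(s)
    return total


# Toy example (NOT Kolmogorov's inner functions):
# x * y = ((x + y)^2 - (x - y)^2) / 4
toy_inner = [
    [lambda t: t, lambda t: t],    # phi_11, phi_12 act on x_1
    [lambda t: t, lambda t: -t],   # phi_21, phi_22 act on x_2
]
toy_outer = [lambda u: u * u / 4, lambda u: -u * u / 4]

value = superposition(toy_outer, toy_inner, (0.3, 0.7))  # approximates 0.3 * 0.7
```

In the theorem itself the inner functions are fixed once and for all, and only the outer functions change with $f_0$; the toy above does not have that universality.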


Proof (J-P. Kahane, 1975): Let $\Phi := \{\varphi \in C[0,1] : \varphi \text{ increasing},\ \varphi(0) = 0,\ \varphi(1) = 1\}$ and let $\lambda_1, \dots, \lambda_n > 0$ be distinct and sum up to 1.

Let $\varepsilon > 0$, to be determined. For each $f \in C([0,1]^n)$, $f \not\equiv 0$, consider the set $\Omega(f)$ of $(\varphi_1, \dots, \varphi_{2n+1})$ s.t. $\exists h \in C[0,1]$ s.t. $\|h\| \le \|f\|$ and

$$\left| f(x_1, \dots, x_n) - \sum_{q=1}^{2n+1} h\!\left(\sum_{p=1}^{n} \lambda_p \varphi_q(x_p)\right) \right| < (1 - \varepsilon)\|f\|. \qquad (*)$$

$\Omega(f)$ is clearly an open set in $\Phi^{2n+1}$. If it is also dense (non-empty), then consider an element $(\varphi_1, \dots, \varphi_{2n+1})$ in the set $\bigcap_{f \in F} \Omega(f)$, where $F$ is a countable dense subset of $C([0,1]^n)$ not containing $f \equiv 0$.


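Then for all $f_0 \in C([0,1]^n)$, $f_0 \not\equiv 0$, $\exists f \in F$ s.t. $\|f\| \le \|f_0\|$ and $\|f_0 - f\| \le \frac{\varepsilon}{2}\|f_0\|$, and a function $h$ satisfying $\|h\| \le \|f\|$ and $(*)$.

Write $h = \gamma(f_0)$ and put $\gamma(0) = 0$. Define, by induction, $h_j = \gamma(f_j)$ and

$$f_{j+1}(x_1, \dots, x_n) = f_j(x_1, \dots, x_n) - \sum_{q=1}^{2n+1} h_j\!\left(\sum_{p=1}^{n} \lambda_p \varphi_q(x_p)\right).$$

By $(*)$ and the choice of $f$, $\|f_{j+1}\| \le \left(1 - \frac{\varepsilon}{2}\right)\|f_j\|$ and $\|h_j\| \le \|f_j\|$, so the series $\sum_{j=0}^{\infty} h_j$ converges in $C[0,1]$ to $g$, et voilà.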
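The induction is a residual-correction scheme: each step removes a fixed fraction of what is left, and the corrections are summed. A finite-dimensional analogue can be sketched as follows. This is only an analogy under stated assumptions: the 2×2 linear map in `apply_T` (standing in for $h \mapsto \sum_q h \circ \chi_q$) and the crude approximate inverse `approx_solve` (standing in for $\gamma$) are hypothetical choices, not objects from the proof.

```python
def apply_T(v):
    """Hypothetical linear map T = [[2, 1], [1, 3]]."""
    return [2 * v[0] + 1 * v[1], 1 * v[0] + 3 * v[1]]


def approx_solve(f):
    """Crude approximate inverse of T: invert only the diagonal."""
    return [f[0] / 2, f[1] / 3]


def sup_norm(v):
    return max(abs(c) for c in v)


def residual_correction(f0, steps=60):
    """Sum the corrections h_j while the residuals f_j shrink geometrically."""
    f = list(f0)
    g = [0.0, 0.0]
    norms = []
    for _ in range(steps):
        h = approx_solve(f)                     # h_j = gamma(f_j)
        Th = apply_T(h)
        f = [f[i] - Th[i] for i in range(2)]    # f_{j+1} = f_j - T h_j
        g = [g[i] + h[i] for i in range(2)]     # partial sum of the h_j
        norms.append(sup_norm(f))
    return g, norms


g, norms = residual_correction([1.0, 2.0])
```

Here `g` ends up solving `apply_T(g) = [1, 2]` up to rounding, just as the limit of the partial sums of the $h_j$ represents $f_0$ in the proof.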

By Baire’s category theorem, Kolmogorov’s representation holds for quasi all $(\varphi_1, \dots, \varphi_{2n+1})$!

It remains to show that $\Omega(f)$ is dense in $\Phi^{2n+1}$.


Let $G \ne \emptyset$ be an open set in $\Phi^{2n+1}$ and $\delta > 0$, to be determined. Denote, for $j \in \mathbb{Z}$ and $q = 1, \dots, 2n+1$,

$$I_q = I_q(j) = [\,q\delta + (2n+1)j\delta,\ q\delta + (2n+1)j\delta + 2n\delta\,].$$

For each fixed $q$, the $I_q(j)$ are disjoint and separated by gaps of length $\delta$.

Every $x \in [0,1]$ appears in one $I_q$ for all values of $q$ except for at most one.

Define $P_q = P_q(j_1, \dots, j_n)$, the cubes $I_q(j_1) \times \dots \times I_q(j_n)$; then every point in $[0,1]^n$ appears in one $P_q$ for all values of $q$ except for at most $n$, i.e., for at least $n + 1$ values of $q$.

Let $\Delta \subset \Phi^{2n+1}$ be the set of tuples s.t. each $\varphi_q$ is constant on each $I_q(j)$ and linear between $I_q(j)$ and $I_q(j+1)$. Then choose $\delta = \delta(G, \varepsilon, f)$ s.t.

$\omega_f(P_q) = \sup_{P_q} f - \inf_{P_q} f \le \varepsilon\|f\|$;

$G \cap \Delta \ne \emptyset$.
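The covering claim (every point of the cube lies in a cube $P_q$ for at least $n + 1$ of the $2n + 1$ values of $q$) can be checked by brute force. The sketch below uses exact rational arithmetic so that interval endpoints are handled correctly; the function names, the sampling grid and the chosen values of $n$ and $\delta$ are illustrative assumptions.

```python
from fractions import Fraction
import random


def in_some_interval(x, q, n, delta):
    """True if x lies in I_q(j) for some integer j."""
    period = (2 * n + 1) * delta
    offset = (x - q * delta) % period      # position inside one period
    return offset <= 2 * n * delta         # first 2n*delta of a period is I_q(j)


def covering_count(point, n, delta):
    """Number of q in 1..2n+1 for which the point lies in some cube P_q."""
    return sum(
        all(in_some_interval(x, q, n, delta) for x in point)
        for q in range(1, 2 * n + 2)
    )


def min_covering_count(n, delta, trials=2000, seed=0):
    """Smallest covering count over random rational points of [0, 1]^n."""
    rng = random.Random(seed)
    grid = 10 ** 6
    best = 2 * n + 1
    for _ in range(trials):
        point = [Fraction(rng.randint(0, grid), grid) for _ in range(n)]
        best = min(best, covering_count(point, n, delta))
    return best
```

For instance, with $n = 1$ and $\delta = 1/10$ the point $11/20$ falls into the gap of exactly one family, so it is covered by $2 = n + 1$ of the three families.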


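Fix an element $(\varphi_1, \dots, \varphi_{2n+1}) \in G \cap \Delta$. By modifying the $\varphi_q$ a little bit if necessary, the function $\chi_q(x_1, \dots, x_n) := \sum_{p=1}^{n} \lambda_p \varphi_q(x_p)$ takes different (constant) values on different $P_q$ (since the $\lambda_p$’s are distinct).

For all $P_q$ set $h(\chi_q(P_q)) = 2\varepsilon f(P_q)$, where $f(P_q)$ denotes the value of $f$ at a chosen point of $P_q$; it is well defined since $(q, j_1, \dots, j_n) \mapsto \chi_q(P_q(j_1, \dots, j_n))$ is injective. Extend it onto $[0,1]$ s.t. $\|h\| \le 2\varepsilon\|f\|$.

For any point $x \in [0,1]^n$, if $x \in P_q$, then $h(\chi_q(x)) = 2\varepsilon f(x) + \rho$, where $|\rho| = 2\varepsilon|f(P_q) - f(x)| \le 2\varepsilon^2\|f\|$. Suppose $x$ lies in exactly $m$ of the cubes $P_q$, so $m \ge n + 1$. Those $m$ indices each contribute $2\varepsilon f(x) + \rho$, and the remaining $2n + 1 - m \le n$ indices contribute at most $2\varepsilon\|f\|$ each. Choosing $\varepsilon \le (4n+2)^{-1}$ we have, by the triangle inequality,

$$\left| f(x) - \sum_{q=1}^{2n+1} h(\chi_q(x)) \right| \le (1 - 2m\varepsilon)|f(x)| + 2m\varepsilon^2\|f\| + 2(2n+1-m)\varepsilon\|f\| \le \left(1 - 2\varepsilon + 2(n+1)\varepsilon^2\right)\|f\| < (1 - \varepsilon)\|f\|.$$

That validates $(*)$, so $G \cap \Delta \subset \Omega(f)$. Since $G$ was an arbitrary non-empty open set and $G \cap \Delta \ne \emptyset$, $\Omega(f)$ is dense in $\Phi^{2n+1}$.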


“The Baire category is a profound triviality which condenses the folk wisdom of a generation of ingenious mathematicians into a single statement.”

T. W. Körner, “Linear Analysis”, Sect. 6, p. 13.


Geometric Interpretation (Doss, Hedberg, Kahane)

Let $\Gamma_p$ be the “increasing” curve in $\mathbb{R}^{2n+1}$ defined by

$$X_q = \varphi_{pq}(t), \quad t \in [0,1], \quad q = 1, \dots, 2n+1,$$

and consider the algebraic sum of these $n$ curves, $E = \Gamma_1 + \dots + \Gamma_n$. Kolmogorov’s theorem then says:

the map $\Gamma_1 \times \dots \times \Gamma_n \to E$ is one-one, so $E$ is a distorted cube;

$E$ is an interpolation set: every continuous function on $E$ can be written in the form $g(X_1) + \dots + g(X_{2n+1})$, where $g$ is continuous.

Theorem (J-P. Kahane, 1980)

Consider increasing curves $\Gamma_1, \dots, \Gamma_n$ in $\mathbb{R}^d$. The map $\Gamma_1 \times \dots \times \Gamma_n \to E$ is quasi surely one-one iff $d \ge 2n + 1$.


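Any Explicit Constructions for $\lambda_p$, $\varphi_q$ and $g$?

For $\gamma \in \mathbb{Z}^+$ consider the $\gamma$-rationals in $[0,1]$: $Q_\gamma := \bigcup_{k=1}^{\infty} Q_\gamma^k$, where

$$Q_\gamma^k := \left\{ \beta(i,k) = \sum_{r=1}^{k} i_r \gamma^{-r} : i_r \in \{0, 1, \dots, \gamma - 1\} \right\}.$$

Definition (Sprecher’s function, 1996)

Let $\varphi : [0,1] \to \mathbb{R}$ be the unique continuous function s.t. $\forall \beta(i,k) \in Q_\gamma^k$,

$$\varphi(\beta(i,k)) = \sum_{r=1}^{k} \tilde{i}_r \, 2^{-m_r} \, \gamma^{-\frac{n^{r-m_r}-1}{n-1}},$$

where $\tilde{i}_r := i_r - (\gamma - 2)\langle i_r \rangle$ and $m_r := \langle i_r \rangle \left(1 + \sum_{s=1}^{r} \sum_{t=s}^{r-1} [i_t]\right)$, $\forall r \ge 1$,

with $\langle i_1 \rangle = [i_1] := 0$, and $\langle i_r \rangle := \mathbf{1}_{i_r = \gamma - 1}$, $[i_r] := \mathbf{1}_{i_r \ge \gamma - 2}$ for $r \ge 2$.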
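The $\gamma$-rationals on which Sprecher's function is prescribed are just the numbers with a finite base-$\gamma$ expansion. A minimal sketch that enumerates one level of them exactly is given below; the function names `beta` and `gamma_rationals` are illustrative assumptions, and the sketch does not attempt to evaluate Sprecher's function itself.

```python
from fractions import Fraction
from itertools import product


def beta(digits, gamma):
    """beta(i, k) = sum_{r=1}^{k} i_r * gamma^(-r) for digits (i_1, ..., i_k)."""
    return sum(Fraction(i_r, gamma ** r) for r, i_r in enumerate(digits, start=1))


def gamma_rationals(gamma, k):
    """All points of the k-th level of gamma-rationals, sorted, as exact fractions."""
    return sorted({beta(d, gamma) for d in product(range(gamma), repeat=k)})


level_two = gamma_rationals(3, 2)   # the nine points 0, 1/9, ..., 8/9
```

Level $k$ is an evenly spaced grid of $\gamma^k$ points with spacing $\gamma^{-k}$, and the levels are nested, so their union is dense in $[0,1]$. That density is why prescribing a continuous function on the $\gamma$-rationals determines it uniquely.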


Definition (Kolmogorov maps)

$\lambda_1 := \frac{1}{2}$, $\lambda_p := \frac{1}{2} \sum_{r=1}^{\infty} \gamma^{-(p-1)\frac{n^r - 1}{n - 1}}$ for $p = 2, \dots, n$; $\lambda := \sum_{p=1}^{n} \lambda_p$.

$\varphi_q(x) := \frac{1}{4n+2}\, \varphi\!\left(x + \frac{q}{\gamma(\gamma - 1)}\right) + \frac{q}{(2n+1)\lambda}$, $q = 0, 1, \dots, 2n$.

$\xi_q(x_1, \dots, x_n) := \sum_{p=1}^{n} \lambda_p \varphi_q(x_p)$, $q = 0, 1, \dots, 2n$.

The $\xi_q$’s take different values on different $\gamma$-rational cubes $P_q$.

Given $f \in C([0,1]^n)$, similar to Kahane’s proof let

$$g(x) := \frac{1}{n+1} f(c_q(j_1, \dots, j_n))$$

for all $x \in \xi_q(P_q(j_1, \dots, j_n))$, where $c_q(j_1, \dots, j_n)$ is the center of each cube, and linearise in between cubes.
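The weights $\lambda_p$ are given by rapidly converging series, so truncating them already gives very accurate rational approximations. The sketch below follows the formula as transcribed from the slide; the function names and the truncation length `terms` are assumptions made for illustration.

```python
from fractions import Fraction


def beta_exponent(r, n):
    """(n^r - 1) / (n - 1) = 1 + n + ... + n^(r-1), an integer for n >= 2."""
    return (n ** r - 1) // (n - 1)


def lam(p, n, gamma, terms=6):
    """lambda_p truncated after `terms` summands (lambda_1 = 1/2 exactly)."""
    if p == 1:
        return Fraction(1, 2)
    partial = sum(
        Fraction(1, gamma ** ((p - 1) * beta_exponent(r, n)))
        for r in range(1, terms + 1)
    )
    return partial / 2


def lam_total(n, gamma, terms=6):
    """Truncated normalising constant lambda = lambda_1 + ... + lambda_n."""
    return sum(lam(p, n, gamma, terms) for p in range(1, n + 1))
```

For $n = 2$ and $\gamma = 10$ the exponents in $\lambda_2$ are $1, 3, 7, 15, \dots$, so each extra term changes the value only far beyond the digits already fixed.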


Can We Compute Them?

Theorem

If the integers $n \ge 2$ and $\gamma \ge 2n + 2$, then Sprecher’s function and the Kolmogorov maps are all computable. Thus the above construction gives a computable representation.

Let $\rho$ be the Cauchy representation of $\mathbb{R}$, i.e. $\rho$ maps (an encoding of) a sequence $(q_j) \in \mathbb{Q}^{\mathbb{N}}$ that converges rapidly, $|q_k - q_j| \le 2^{-k}$ for all $j \ge k$, to its limit $x$.

Definition (Computable functions)

A function $f : \mathbb{R} \to \mathbb{R}$ is computable if there exists a Turing machine $M$ with one-way output tape and alphabet $\Sigma$ that computes a function $F_M : \Sigma^\omega \to \Sigma^\omega$ s.t. $\rho \circ F_M(a) = f \circ \rho(a)$ for all $a \in \operatorname{dom}(f \circ \rho)$.
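The idea behind this definition can be illustrated without Turing machines: a real number is handed over as a "name", i.e. a procedure producing rapidly converging rational approximations, and a computable function turns names of $x$ into names of $f(x)$. The sketch below is a simplification under stated assumptions: names are modelled as Python functions `k -> q_k` rather than infinite tapes, and `sqrt2_name` and `double` are hypothetical examples.

```python
from fractions import Fraction


def sqrt2_name(k):
    """Rational q_k with |q_k - sqrt(2)| <= 2^-(k+1), found by bisection."""
    lo, hi = Fraction(1), Fraction(2)          # invariant: lo^2 <= 2 < hi^2
    while hi - lo > Fraction(1, 2 ** (k + 1)):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo


def double(name):
    """Turn a name of x into a name of 2x by asking for one extra bit of precision."""
    return lambda k: 2 * name(k + 1)


two_sqrt2 = double(sqrt2_name)
```

Because each $q_k$ is within $2^{-(k+1)}$ of the limit, any two terms satisfy $|q_k - q_j| \le 2^{-k}$ for $j \ge k$, which is the rapid-convergence condition above. `double` only ever reads finitely many approximations of its input to produce each approximation of its output, which is the essential feature of the machine model.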


References

1 J-P. Kahane, Sur le Théorème de Superposition de Kolmogorov, Journal of Approximation Theory 13, 229-234, 1975.

2 J-P. Kahane, Baire’s Category Theorem and Trigonometric Series, Journal d’Analyse Mathématique, vol 80, 2000.

3 J-P. Kahane, Sur le Treizième Problème de Hilbert, le Théorème de Superposition de Kolmogorov, et les Sommes Algébriques d’Arcs Croissants, Harmonic Analysis Iraklion 1978 Proceedings, Lecture Notes in Math. 781, Springer-Verlag, Berlin, 1980, pp. 76-101.

4 E. Charpentier, A. Lesne, N. Nikolski, Kolmogorov’s Heritage in Mathematics, Springer, 2004.
