university of alberta library release form name of author...
TRANSCRIPT
University of Alberta
Library Release Form
Name of Author: Ryan Gordon Trelford
Title of Thesis: The Largest k-Ball in an n-Box
Degree: Master of Science
Year This Degree Granted: 2008
Permission is hereby granted to the University of Alberta Library to reproduce single
copies of this thesis and to lend or sell such copies for private, scholarly, or scientific
research purposes only.
The author reserves all other publication and other rights in association with the
copyright in the thesis, and except as herein before provided, neither the thesis
nor any substantial portion thereof may be printed or otherwise reproduced in any
material form whatever without the author’s prior written permission.
Department of Mathematical and Statistical Sciences
University of Alberta
Edmonton, Alberta
Canada, T6G 2G1
Date:
University of Alberta
THE LARGEST k-BALL IN AN n-BOX
by
Ryan Gordon Trelford
A thesis submitted to the Faculty of Graduate Studies and Research
in partial fulfillment of the requirements for the degree of
Master of Science
in
Mathematics
Department of Mathematical and Statistical Sciences
Edmonton, Alberta
Fall, 2008
University of Alberta
Faculty of Graduate Studies and Research
The undersigned certify that they have read and recommend to the Faculty of
Graduate Studies and Research for acceptance, a thesis entitled The Largest k-
Ball in an n-Box submitted by Ryan Gordon Trelford in partial fulfillment of
the requirements for the degree of Master of Science in Mathematics
Dr. D. Hrimiuc (Supervisor)
Dr. W. Krawcewicz (Chair, Examiner)
Dr. M. Jagersand (Computing Science)
Date:
For my family:
Debra, Gordon, Erin, Jim and Christine
and
Angela
ABSTRACT
We develop a formula for the radius of the largest k-dimensional ball that can be
contained inside an n-dimensional box, where k = 1, . . . , n, and develop an algorithm
to give the equations for every k-dimensional flat that contains one of these balls of
maximal radius.
ACKNOWLEDGEMENTS
I would like to give my k-thanks to the following n-people (k = many, n = 14):
First and foremost, my supervisor, Dr. Dragos Hrimiuc, for his unending sup-
port, guidance, patience, and for sharing so much of his knowledge with me. This
thesis would not have been possible without his much appreciated input, enthusi-
asm, and dedication.
My sincere thanks also go to Angela Beltaos for allowing me to explain my work
to her and for always being there and believing in me.
Dr. Mohammad Akbar, Dr. Ken Andersen, Dr. Wieslaw Krawcewicz, and Dr.
Haibo Ruan for their encouragement and friendship during my program.
Andrew Beltaos, Jenn Gamble, Brian Ketelboeter, and Dr. Marton Naszodi for
their interest in my research (or at least pretending to be interested) and asking
many provocative questions.
Rick Mikalonis and Patti Bobowsky for always ensuring that I had funding dur-
ing my program, and Tara Schuetz for making sure my program stayed on track
during my time here.
Finally, my thanks go to Hongtao Yang for creating the LATEX file mathesis.sty,
which greatly aided in the writing of this thesis.
Table of Contents
1 Introduction 1
2 Gram Determinants and the Distance Between Flats 3
2.1 The Gram Determinant: Definition and Properties . . . . . . . . . . 3
2.2 Distance Formulae . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3 The Largest Ball in a Box Problem 10
3.1 Notation and Basic Ideas . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2 The (n− 1, n) Problem . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2.1 A Formula for dj . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2.2 Finding r(n− 1, n) and M . . . . . . . . . . . . . . . . . . . 14
3.2.3 Contact Points . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.2.4 The (2, 3) Problem . . . . . . . . . . . . . . . . . . . . . . . . 21
3.3 The (k, n) Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.3.1 A Formula for dj . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.3.2 Choosing an Orthonormal Basis . . . . . . . . . . . . . . . . 27
3.3.3 Finding r(k, n) and M . . . . . . . . . . . . . . . . . . . . . . 44
3.3.4 The (k, 4) Problem . . . . . . . . . . . . . . . . . . . . . . . . 48
Bibliography 52
A An Algorithm to Find m1, . . . ,mn−k 53
Chapter 1
Introduction
This thesis will deal with the problem of finding the radius and the location of the
largest k-dimensional euclidian ball in an n-dimensional box where 1 ≤ k ≤ n. A
precise formula for the radius of such a ball was first found in [2] by Hazel Everett
(Universite du Quebec a Montreal, Departemente d’Informatique), Ivan Stojmen-
ovic (University of Ottawa, Department of Computer Science), Pavel Valtr (Charles
University, Department of Applied Mathematics), and Sue Whitesides (McGill Uni-
versity, School of Computer Science). After showing that there exists a ball of
maximal radius with its center at the origin, they approached the problem by calcu-
lating the distance from the origin to the intersection of an arbitrary k-dimensional
linear subspace with each of the hyperplanes containing the faces of the box. This
was done in terms of k vectors v1, . . . ,vk which formed an orthonormal basis for
the k-dimensional linear subspace. They then gave conditions on these distances
which determined if a k-dimensional linear subspace contained a k-dimensional ball
of maximal radius in the box. However, the locations of these k-dimensional balls
of maximal radius (or equivalently, the k-dimensional linear subspaces containing
them) were not found, but were rather left as an unsolved problem. To date, no
such solution appears in the literature. Here, we will again calculate the radius
of the largest k-dimensional ball in an n-dimensional box, but we will use n − korthonormal vectors m1, . . . ,mn−k which form an orthonormal basis of the n − k-
dimensional subspace orthogonal to the k-dimensional subspaces mentioned above.
We will obtain the results of [2], but in a far more geometrical way, and we will also
derive the locations of the k-dimensional subspaces where these k-dimensional balls
1
of maximal radius lie.
Since distance will play an important role in this thesis, we devote all of chapter
2 to this topic. Section 2.1 will define the Gram determinant, and list some of its
properties. The section following this will show how to employ the Gram determi-
nant to find the distance between affine subspaces of any dimension and show the
equivalence of this method with one that was previously known (see [3], [1] and [4]).
In chapter 3, we address the problem of finding the largest k-dimensional ball
in an n-dimensional box. Section 3.1 will introduce some terminology, as well as
state a couple of useful facts. In section 3.2, we will find the radius and locations
of the largest n − 1 dimensional balls in an n-dimensional box, as well as stating
the points where these balls of maximal radius contact the faces of the box. We
generalize these results in section 3.3, where we find the radius and locations of the
largest k-dimensional balls in an n-dimensional box for any k = 1, . . . , n− 1.
Since calculating specific solutions by hand will become increasingly unrealistic
as n− k becomes large, we include an algorithm in the Appendix that will quickly
do the calculations.
2
Chapter 2
Gram Determinants and the
Distance Between Flats
2.1 The Gram Determinant: Definition and Properties
We will begin with a brief study of the Gram determinant and its properties, which
will be useful in the next chapter. We start with a couple of useful theorems from
linear algebra which can be found in [5].
Theorem 2.1 (Gram-Schmidt Process). Given a linearly independant set
{u1, . . .uk} ⊂ Rn, define
v1 = u1,
v2 = u2 −〈u2, v1〉〈v1, v1〉
v1,
v3 = u3 −〈u3, v1〉〈v1, v1〉
v1 −〈u3, v2〉〈v2, v2〉
v2,
...
vk = uk −〈uk, v1〉〈v1, v1〉
v1 −〈uk, v2〉〈v2, v2〉
v2 − . . .−〈uk, vk−1〉〈vk−1, vk−1〉
vk−1.
Then {v1, . . . , vk} is an orthogonal set and span{v1, . . . , vp} = span{u1, . . . ,up} for
1 ≤ p ≤ k.
Remark 2.1. We can reformulate Theorem 2.1 by saying that v1 = u1 and for every
k = 2, . . . , n, vk = uk − projVk−1uk where Vk−1 = span{v1, . . . ,vk−1}.
3
Theorem 2.2 (QR Factorization). If A is an m× n matrix with linearly inde-
pendent columns u1, . . . ,un, then A can be uniquely factored as A = QR, where Q
is an m × n matrix whose columns, v1, . . . , vn, form an orthonormal basis for the
column space of A, and R is an n × n upper triangular matrix with r11 = ‖v1‖,rjj = ‖vj − projVj−1
vj‖ for j = 2, . . . , n, and rij = 〈ui, vj〉 for i < j, where
Vj−1 = span{v1, . . . , vj−1}.
Remark 2.2. Setting m = n in theorem 2.2 and keeping in mind that Q is an
orthogonal matrix and R is an upper triangular matrix with positive entries on its
main diagonal, we can write the absolute value of the determinant of A as:
| det(A)| = |det(QR)| = |det(Q)||det(R)| = | ± 1||det(R)| = det(R)
= ‖v1‖‖v2 − projV1v2‖ · · · ‖vn − projVn−1
vn‖.
Definition 2.1 (Gram Matrix, Gram Determinant). Let v1, . . . ,vk ∈ Rn
be the columns of an n × k matrix A. The matrix A>A is known as the Gram
matrix of the vectors v1, . . . ,vk and is denoted by G(v1, . . . ,vk). The determinant
of A>A is called the Gram determinant of the vectors v1, . . . ,vk and is denoted by
Gram(v1, . . . ,vk). Thus,
Gram(v1, . . . ,vk) = det(A>A) =
∣∣∣∣∣∣∣∣∣〈v1,v1〉 . . . 〈v1,vk〉
.... . .
...
〈vk,v1〉 . . . 〈vk,vk〉
∣∣∣∣∣∣∣∣∣.Remark 2.3. If A is an n × k matrix with linearly independant columns, then we
can apply the QR factorization:
A>A = (QR)>QR
= R>Q>QR
= R>IR
= R>R,
4
so that
det(A>A) = det(R>R)
= det(R>) det(R)
= (det(R))2
= ‖v1‖2‖v2 − projV1v2‖2 . . . ‖vk − projVk−1
vk‖2.
Theorem 2.3 (Properties of the Gram Determinant). Let v1, . . . , vk be
vectors in Rn. We have the following:
1. Gram(vσ(1), . . . , vσ(k)) = Gram(v1, . . . , vk) where σ is any permutation of
{1, . . . , k}.
2. Gram(v1, . . . , αvi, . . . , vk) = α2 Gram(v1, . . . , vi, . . . , vk) for every α ∈ R and
for every i = 1, . . . , k.
3. Gram(v1, . . . , vk) = 0 if and only if {v1, . . . , vk} is a linearly dependant set.
4. 0 < Gram(v1, . . . , vk) ≤∏ki=1 ‖vi‖2 for a linearly independant set {v1, . . . , vk}.
Proof.
1. It will suffice to show
Gram(v1, . . . ,vi, . . . ,vj , . . . ,vk) = Gram(v1, . . . ,vj , . . . ,vi, . . . ,vk).
By interchanging ith and jth vectors, we interchange the ith and jth rows and
the ith and jth columns of the Gram matrix. Thus, the Gram determinant
changes by a factor of (−1)2.
2. Multiplying the ith vector by α results in multiplying the ith row and the ith
column of the Gram matrix by α. Thus, the Gram determinant changes by a
factor of α2.
3. Let A be an n×k matrix with v1, . . . ,vk as it columns. If Gram(v1, . . . ,vk) =
0 then det(A>A) = 0. Then there must be one column of A>A, say the ith
5
column, which is a linear combination of the other columns:
(2.1)
〈v1,vi〉
...
〈vk,vi〉
=∑j 6=i
cj
〈v1,vj〉
...
〈vk,vj〉
=
〈v1,
∑j 6=i cjvj〉...
〈vk,∑
j 6=i cjvj〉
.
Therefore, for l = 1, . . . , k, 〈vl,vi−∑
j 6=i vj〉 = 0 so that vl ⊥(vi −
∑j 6=i vj
).
Setting v = vi −∑
j 6=i cjvj , we have v ∈ span{v1, . . . ,vk}. But since vl ⊥ v
for l = 1, . . . , k, v = 0 and vi =∑
j 6=i cjvj and hence {v1, . . . ,vk} is linearly a
dependant set. On the other hand, if the set {v1, . . . ,vk} is linearly dependant,
then there is at least one vector, say vi such that vi =∑
j 6=i vj . It then follows
from (2.1) that the ith column of A>A is a linear combination of the other
columns so that det(A>A) = Gram(v1, . . . ,vk) = 0.
4. Gram(v1, . . . ,vk) = ‖v1‖2‖v2 − projV1v2‖2 . . . ‖vk − projVk−1
vk‖2 > 0 since
{v1, . . . ,vk} is linearly independant. We will show that for every x ∈ Rn and
for every subspace L of Rn we have ‖x−projL x‖2 ≤ ‖x‖2. Since Rn = L⊕L⊥,
we have that x can be uniquely written as x = projL x + projL⊥ x. Since
L ⊥ L⊥, Pythagoras’ theorem yields ‖x‖2 = ‖ projL x‖2 + ‖projL⊥ x‖2. But
projL⊥ x = x − projL x gives ‖x‖2 = ‖ projL x‖2 + ‖x − projL x‖2 so that
‖x− projL x‖2 ≤ ‖x‖2. Thus, Gram(v1, . . . ,vk) ≤∏ki=1 ‖vi‖2.
Remark 2.4. Note that ‖vi−projVi−1vi‖2 = ‖vi‖2 if and only if projVi−1
vi = 0 if and
only if vi ⊥ Vi−1. Thus, Gram(v1, . . . ,vk) =∏ki=1 ‖vi‖2 if and only if {v1, . . . ,vk}
is an orthogonal set.
Remark 2.5. Remark 2.3 states that for a linearly independent set {v1, . . . ,vk}, we
have that
(2.2) Gram(v1, . . . ,vk) = ‖v1‖2‖v2 − projV1v2‖2 . . . ‖vk − projVk−1
vk‖2.
However, if {v1, . . . ,vk} is a linearly dependant set, then there is an i such that
vi ∈ Vi−1. It follows that ‖vi− projVi−1vi‖ = 0 which of course implies ‖v1‖2‖v2−
projV1v2‖2 . . . ‖vk − projVk−1
vk‖2 = 0. Combining this with part 3 of theorem 2.3
6
verifies (2.2) for any set {v1, . . . ,vk}.
2.2 Distance Formulae
In this section, we will be concerned with finding the distance between two flats,
given that we know a basis for both of their underlying vector spaces as well as one
point lying in each flat.
Theorem 2.4 (Distance from a point to a flat). Let L = p + L be a k-
dimensional flat in Rn, and let x ∈ Rn. Then the distance from x to L is given
by
(2.3) dist(x,L) =
√Gram(x− p, v1, . . . , vk)
Gram(v1, . . . , vk)
where {v1, . . . , vk} is a basis for L.
Proof.
dist(x,L) = dist(x,p + L) = dist(x− p, L)
= ‖(x− p)− projL(x− p)‖
=
√Gram(v1, . . . ,vk)‖(x− p)− projL(x− p)‖2
Gram(v1, . . . ,vk)
=
√Gram(v1, . . . ,vk,x− p)
Gram(v1, . . . ,vk)
=
√Gram(x− p,v1, . . . ,vk)
Gram(v1, . . . ,vk).
Theorem 2.5 (Distance between Arbitrary flats). Let L1 = p1 + L1,L2 =
p2 + L2 be any two flats in Rn. Then the distance from L1 to L2 is given by
(2.4) dist(L1,L2) =
√Gram(p1 − p2, v1, . . . , vk)
Gram(v1, . . . , vk)
where {v1, . . . , vk} is a basis for L1 + L2.
7
Proof.
dist(L1,L2) = dist(p1 + L1,p2 + L2)
= infx∈L1,y∈L2
dist(p1 + x,p2 + y)
= infx∈L1,y∈L2
dist(p1 − p2,y− x)
= infx∈L1,y∈L2
dist(p1 − p2,x + y)
= inft∈L1+L2
dist(p1 − p2, t)
= dist(p1 − p2, L1 + L2).
But from the proof of theorem 2.4, we have
dist(p1 − p2, L1 + L2) =
√Gram(p1 − p2,v1, . . . ,vk)
Gram(v1, . . . ,vk)
where {v1, . . . ,vk} is a basis for L1 + L2.
The following alternate distance formula can be found in [3], [1] and [4]. Let A
be an m1×n matrix and B be an m2×n matrix, both with orthonormal rows. Set
L1 = {x ∈ Rn|Ax = a}, and L2 = {x ∈ Rn|Bx = b}.
Theorem 2.6. Let p1, p2 be arbitrary elements of the two flats L1 and L2 respectively
and let C be any matrix with orthonormal rows such that Row(C) = Row(A) ∩Row(B). Then,
(2.5) dist(L1,L2) = ‖C(A>a−B>b)‖ = ‖C(p1 − p2)‖.
2
We wish to show that (2.4) and (2.5) are equivalent. We have already seen that
dist(L1,L2) = dist(p1 − p2, L1 + L2).
8
It now follows that
dist(L1,L2) = ‖(p1 − p2)− projL1+L2(p1 − p2)‖
= ‖(p1 − p2)− projNull(A)+Null(B)(p1 − p2)‖
= ‖ projRow(A)∩Row(B)(p1 − p2)‖
= ‖ projRow(C)(p1 − p2)‖
= ‖C>C(p1 − p2)‖
= ‖C(p1 − p2)‖
since C> has orthonormal columns. We find then that (2.4) and (2.5) are essentially
equivalent, however, (2.4) does not require us to find an orthonormal basis.
9
Chapter 3
The Largest Ball in a Box
Problem
We will now address the problem of finding the radius of the largest k-dimensional
ball that can be contained in an n-dimensional box. Note that in this chapter, we
will simply say n-box and k-ball, and drop the word dimensional.
3.1 Notation and Basic Ideas
We will begin with some notation that will be used throughout this chapter. First,
let B = {x ∈ Rn| − aj ≤ xj ≤ aj , j = 1 . . . n} be the box under consideration.
We have from the definition that B is a convex and symmetric box of dimensions
2a1 × 2a2 × . . .× 2an. There will be no loss of generality if we assume
(3.1) 0 < a1 ≤ a2 ≤ . . . ≤ an.
The 2n faces of B lie in hyperplanes which we will denote by H±j = {x ∈ Rn|xj =
±aj}. By the symmetry of B, we only need to consider H+j , which we will simply
denote as Hj . We will refer to the face of B lying in Hj as the jth face of B. Bwill denote any euclidian k-ball contained in B, and by M, we mean the k-flat on
which B lies. We define Mj =M∩Hj to be the intersection of the k-flat with the
hyperplane containing the jth face of B. Finally, by r(k, n) we mean the largest
possible radius B can have. So our goal then is to find equations for r(k, n) and
10
M in terms of a1, . . . , an. To simplify notation we set δj =√a2
1 + . . .+ a2j , with
1 ≤ j ≤ n, and δ0 = 0. We include here a couple of lemmas.
Lemma 3.1. If B ⊂ B and has radius r, then there is a ball also having radius r
with its center at the origin.
Proof. Let B(b, r) ⊂ B be any ball with radius r centered at b. By the sym-
metry of B, the ball B(−b, r) is included in B. From the convexity of B,B =(12B(b, r) +
12B(−b, r)
)⊂ B. We will show B = B(0, r). Let z ∈ B. Then there
exists an x1 ∈ B(b, r), and an x2 ∈ B(−b, r) such that ‖x1 − b‖ ≤ r, ‖x2 + b‖ ≤ r,
with z = x1+x22 . Then,
‖z‖ = ‖x1+x22 ‖ =
(12
)‖x1 − b + x2 + b‖ ≤
(12
)(‖x1 − b‖+ ‖x2 + b‖) ≤ r
and z ∈ B(0, r). On the other hand, suppose z ∈ B(0, r) Set x1 = z + b ∈ B(b, r),
x2 = z − b ∈ B(−b, r). Then, x1+x22 = z+b+z−b
2 = z and we have that z ∈ B.
By lemma 3.1, we may assume thatM is a linear subspace, which simplifies our
work. In what follows, we will find the distance from the origin toMj which will be
denoted by dj . This will be achieved by first finding d′j , the distance from the origin
to p +M⊥j where p is any point in Mj . We can then find dj using Pythagoras’
theorem. We will show that the ball B of maximal radius r(k, n) does not lie in a
k-flat parallel to any of the faces of B, ensuring that Mj is non-empty for every
j = 1 . . . , n. The maximal radius, r(k, n), that any k-ball can have on the given
k-flatM will then be given by the minimum of the resulting dj ’s. Finally, e1, . . . , en
is the standard basis of Rn; the jth entry of ej is 1, while all other entries are 0.
We state here some inequalities involving the aj ’s.
Lemma 3.2. Let k ≤ n− 1 and j ≤ n. Then
aj ≤ dj for j = 1, . . . , n and Mj 6= ∅.(3.2)
aj ≤δj−1√k − 1
if and only if aj ≤δj√k
for k ≥ 2 and j ≥ 2.(3.3)
If aj ≤δj−1√k − 1
then aj−1 ≤δj−2√k − 2
for k ≥ 3 and j ≥ 3.(3.4)
Proof. (3.2) follows from the fact that Mj ⊂ Hj and that the distance from the
origin to Hj is clearly aj . For (3.3), with k, j ≥ 2 we have
11
a2j ≤
δ2jk
=δ2j−1
k+a2j
k
which is equivalent to
(k − 1)a2j
k≤δ2j−1
k
or more simply
a2j ≤
δ2j−1
k − 1.
For (3.4), with k, j ≥ 3 we have using (3.1)
a2j−1 ≤ a2
j ≤δ2j−1
k − 1.
It follows that
a2j−1 ≤
δ2j−2
k − 1+a2j−1
k − 1
so that(k − 2)a2
j−1
k − 1≤
δ2j−2
k − 1
from which it follows
a2j−1 ≤
δ2j−2
k − 2.
Before we begin our discussion on the largest (n−1)-ball in an n-box, we mention
the trivial case of finding the radius of the largest n-ball in an n-box. In order for
such a ball to be contained in a box of the same dimension, the diameter of the
ball cannot exceed the smallest measurement of the box. In terms of the above
definitions, we find that the diameter cannot exceed 2a1, or that the radius cannot
exceed a1. Thus, r(n, n) = a1 for every n. Having dealt with this case, we now add
the restriction that n ≥ 2 and 1 ≤ k ≤ n− 1.
12
3.2 The (n− 1, n) Problem
Here, we will find the radius of the largest (n − 1)-ball contained in an n-box, as
well as the equations of the planes containing a ball of such radius. We consider an
(n− 1)-flat M = {x ∈ Rn : 〈m,x〉 = 0} where m = (m1, . . . ,mn) ∈ Rn and m 6= 0.
Without loss of generality, we can assume m is a unit vector. The simplest planes
to consider are where m = ej , j = 1, . . . , n. Here, M is parallel to Hj and the
maximal radius a ball can have in this case is min{a1, . . . , aj−1, aj+1, . . . , an}. This
minimum is a2 if j = 1, and is a1 otherwise. We now only consider (n− 1)-flats Mwhere m 6= ej , j = 1, . . . , n, or equivalently, any (n− 1)-flat M such that for every
j = 1, . . . , n,Mj 6= ∅.
3.2.1 A Formula for dj
In this section, we will state a formula for dj , j = 1, . . . , n when k = n− 1. We will
also give an equation that will prove useful when deciding if a given (n− 1)-flat Mcontains a ball of maximal radius in B.
Lemma 3.3. The distance, dj, j = 1, . . . n, from the origin to the (n− 2)-flat Mj
is given by
(3.5) d2j =
a2j
1− (mj)2.
Proof. Let p ∈Mj . Then p ∈M from which it follows that 〈p,m〉 = 0, and p ∈ Hjwhich gives us 〈p, ej〉 = aj . Furthermore, p can be written as ajv for some suitable
13
v. We then have the distance, d′j , from the origin to p+M⊥j as
(d′j)2 =
Gram(p,m, ej)Gram(m, ej)
=
∣∣∣∣∣∣∣∣a2j‖v‖2 0 aj
0 1 mj
aj mj 1
∣∣∣∣∣∣∣∣∣∣∣∣∣∣ 1 mj
mj 1
∣∣∣∣∣∣=a2j‖v‖2(1− (mj)2)− a2
j
1− (mj)2.
Then,
d2j = ‖p‖2 − (d′j)
2
= a2j‖v‖2 −
a2j‖v‖2(1− (mj)2)− a2
j
1− (mj)2
=a2j‖v‖2(1− (mj)2)− a2
j‖v‖2(1− (mj)2) + a2j
1− (mj)2
=a2j
1− (mj)2.
Lemma 3.4. For any (n− 1)-flat M not parallel to Hj, for j = 1, . . . , n, we have
the following equality:
(3.6)n∑j=1
a2j
d2j
= n− 1.
Proof.n∑j=1
a2j
d2j
=n∑j=1
(1− (mj)2) = n−n∑j=1
(mj)2 = n− 1.
3.2.2 Finding r(n− 1, n) and M
Theorem 3.1. Let B′ = {x ∈ Rn′ |−aj ≤ xj ≤ aj , j = 1 . . . n′} with 0 < a1 ≤ . . . ≤an′ be a box of dimensions 2a1× . . .×2an′. The following statements are equivalent:
14
1. The (n′ − 1)-ball of maximal radius is tangent to all faces of B′,
2. an′ ≤δn′−1√n′ − 2
.
Proof. (1) ⇒ (2): Suppose there is an (n′ − 1)-ball of maximal radius tangent to
all the faces of B′. Then d1 = . . . = dn′ = t, and (3.6) gives t = δn′√n′−1
. (3.2) now
ensures us that an′ ≤δn′√n′−1
, and (3.3) shows an′ ≤δn′−1√n′−2
.
(2)⇒ (1): Assume that an′ ≤δn′−1√n′−2
. Then (3.3) gives an′ ≤δn′√n′−1
, or equivalently,
(n′ − 1)a2n′ ≤ δ2n′ . Let M be the (n′ − 1)-flat such that
(3.7) mj =
√δ2n′ − (n′ − 1)a2
j
δn′, j = 1, . . . , n′.
Defining the mj ’s in this fashion, (3.5) gives
d2j =
a2j
1− (mj)2
= a2j
(1−
δ2n′ − (n′ − 1)a2j
δ2n′
)−1
= a2j
((n′ − 1)a2
j
δ2n′
)−1
=δ2n′
n′ − 1
for every j = 1, . . . , n′. The maximality follows from (3.6): If we want to increase
the radius, we need to increase all the dj ’s. But an increase in one, say dj1 , requires
a decrease in another, say dj2 , where j1 6= j2, which would only result in a ball of
strictly smaller radius. It now remains only to show that the point of Mj closest
to the origin lay in the jth face of B′. We will argue by contradiction. Suppose for
some j = 1, . . . , n′, the closest point ofMj to the origin, say pj , is not contained in
B′. Construct a line l from the origin to pj . Since 0 ∈ B′, and l ⊂M, there exists an
i = 1, . . . , n′, i 6= j, such that l ∩Mi = {pi}. It follows that di ≤ ‖pi‖ < ‖pj‖ = dj .
Thus, we have that di 6= dj , clearly contradicting their equality. Thus we have that
the (n′ − 1)-flat M given by the equation
m1x1 + . . .+mn′xn′
= 0
where the mj ’s are defined as in (3.7), contains an (n′ − 1)-ball of maximal radius
15
that is tangent to all faces of B′.
Theorem 3.2. Let s be the smallest integer satisfying
(3.8) an−s ≤δn−s−1√n− s− 2
.
Then the maximal radius of B ⊂ B occurs when d1 = . . . = dn−s = t with the radius
being their common value. That is,
(3.9) r(n− 1, n) = t =δn−s√n− s− 1
.
Furthermore, mn−s+1 = . . . = mn = 0.
Proof. If s = 0 is the smallest integer satisfying (3.8) then an ≤ δn−1√n−2
and from
theorem 3.1 (with n′ = n) we have
d1 = . . . = dn = t =δn√n− 1
which gives us (3.9).
Now suppose s = 1 is the smallest integer satisfying (3.8). Then an >δn−1√n−2
and
an−1 ≤ δn−2√n−3
. Intersecting B with the hyperplane xn = 0 yields an (n − 1)-box
of dimensions 2a1 × . . .× 2an−1. Intersecting any (n− 1)-ball contained in B (not
parallel to any face of B) with the same hyperplane gives an (n−2) ball of the same
radius contained in the (n− 1)-box. This implies that r(n− 2, n− 1) ≥ r(n− 1, n).
We want to show that r(n − 2, n − 1) = r(n − 1, n). Applying theorem 3.1 with
n′ = n − 1 we find d1 = . . . = dn−1 = t where t = δn−1√n−2
= r(n − 2, n − 1) with the
(n− 2)-ball being tangent to all faces of the (n− 1)-box. Using the mj ’s from (3.7),
and setting mn = 0, we can extend the (n− 2)-flat containing the (n− 2)-ball to an
(n − 1)-flat M containing an (n − 1)-ball of radius δn−1√n−2
. The ball lies in B since
an >δn−1√n−2
. It is maximal in B because r(n − 1, n) cannot exceed r(n − 2, n − 1).
This (n− 1)-flat M is unique (up to the symmetry of B) as guaranteed by (3.6).
Now suppose s = 2 is the smallest integer satisfying (3.8). Then an−1 >δn−2√n−3
and an−2 ≤ δn−3√n−4
. Intersecting B with the hyperplanes xn = 0 and xn−1 = 0 gives
us an (n − 2)-box of dimensions 2a1 × . . . × 2an−2. Intersecting any (n − 1)-ball
contained in B (again, not parallel to any face of B) with the same hyperplanes
yields an (n−3)-ball of equal radius which is contained in the (n−2)-box. It follows
16
that r(n − 3, n − 2) ≥ r(n − 1, n). Applying theorem 3.1 with n′ = n − 2 we see
d1 = . . . = dn−2 = t with t = δn−2√n−3
= r(n−3, n−2) with the (n−3)-ball tangent to
all faces of the (n− 2)-box. Using the mj ’s from (3.7), and setting mn−1 = mn = 0,
we extend the (n−3)-flat containing the (n−3)-ball to an (n−1)-flatM containing
an (n− 1)-ball of radius δn−2√n−3
. This ball lies in B since an ≥ an−1 >δn−2√n−3
and it is
maximal as r(n−1, n) cannot be greater than r(n−3, n−2). (3.6) again guarantees
that this (n− 1)-flat M is unique up to the symmetries of B.
We continue in this fashion until an s is found so that (3.8) is satisfied, which
must happen since it is clearly true for s = n− 2.
We mentioned at the beginning of this section that if M was parallel to Hj for
j = 1, . . . , n, that the largest radius any ball B lying on M could have was a2. We
now need to show that (3.9) is always strictly greater than a2.
[r(n− 1, n)]2 =δ2n−s
n− s− 1=∑n−s
i=1 a2i
n− s− 1
≥ a21 + (n− s− 1)a2
2
n− s− 1=
a21
n− s− 1+ a2
2
> a22
so that r(n− 1, n) > a2. Thus (3.9) is that largest possible radius any (n− 1)-ball
B can have in B.
Corollary 3.1. The (n− 1)-flat M containing the largest (n− 1)-ball in B has as
its normal vector
(3.10) m =
√δ2n−s − (n− s− 1)a2
1
δn−s...√
δ2n−s − (n− s− 1)a2n−s
δn−s
0...
0
where s is as in theorem 3.2
Proof. From theorem 3.2, mn−s+1, . . . ,mn = 0. From theorem 3.1, with n′ = n− s,
17
we have for j = 1, . . . , n− s
(mj) =
√δ2n−s − (n− s− 1)a2
j
δn−s.
3.2.3 Contact Points
In this section, we assume that B is an (n − 1)-ball of maximal radius in the n-
box B with radius r(n − 1, n) = δn−s√n−s−1
where s is as in theorem 3.2. Let M be
the (n − 1)-flat that contains B with normal vector m as defined in corollary 3.1.
Theorem 3.2 gives dn−s+1 = an−s+1 and an−s+1 >δn−s√n−s−1
= r(n − 1, n) so that
B∩Mn−s+1 = ∅. Also, since di = ai for i = n− s+ 2, . . . , n, and since the ai’s form
an increasing sequence, we have that B ∩Mi = ∅ for i = n− s+ 2, . . . , n. Theorem
3.1 guarantees that B is tangent to the first n− s faces of B so that B ∩Mi = {ci}for i = 1, . . . , n − s. We will write these ci’s as (c1i , . . . , c
ni ). We have immediately
that ‖ci‖ = r(n− 1, n).
Theorem 3.3. For i = 1, . . . , n− s and j = 1, . . . , n,
(3.11) cji =
−
mimjδ2n−s(n− s− 1)ai
if j = 1, . . . , n− s and j 6= i,
ai if j = i,
0 if j = n− s+ 1, . . . , n.
Proof. We will examine cn−s, the contact point B makes with the (n− s) face of B.
Since Mn−s =M∩Hn−s, Any point x ∈Mn−s must satisfy
n−s∑j=1
xjmj = 0,(3.12)
xn−s = an−s.(3.13)
It follows then that cn−sn−s = an−s and that the points pl = (p1l , . . . , p
nl ) for l =
18
1, . . . , n− 1 given by
pjl =
−an−sm
n−s
mjif j = l,
an−s if j = n− s,0 otherwise
when l = 1, . . . , n− s− 1 and by
pjl =
−an−smn−s
m1if j = 1,
an−s if j = n− s,1 if j = l + 1,
0 otherwise
when l = n − s, . . . , n − 1 all belong to Mn−s since they satisfy conditions (3.12)
and (3.13). We can then construct n − 2 vectors vl = p1pl where l = 2, . . . , n − 1.
In terms of coordinates, we have
vjl =
an−sm
n−s
m1if j = 1,
−an−smn−s
mlif j = l,
0 otherwise
when l = 2, . . . , n− s− 1 and by
vjl =
1 if j = l + 1,
0 otherwise
when l = n− s, . . . , n− 1. The (n− 2) vl’s form a linearly independent set and thus
a basis for Mn−s since
dim(Mn−s) = dim(M) + dim(Hn−s)− dim(M+Hn−s)
= (n− 1) + (n− 1)− n
= n− 2.
Now, since cn−s is the point of tangency of B with Mn−s, we must have that
19
〈vl, cn−s〉 = 0 for l = 2, . . . , n− 1. First, for l = n− s, . . . , n− 1,
〈vl, cn−s〉 = cl+1n−s = 0
so that cn−s+1n−s = . . . = cnn−s = 0. Then, for l = 2, . . . , n− s− 1,
〈vl, cn−s〉 =c1n−san−sm
n−s
m1−cln−san−sm
n−s
ml= 0
or equivalently
(3.14) ml =cln−sm
1
c1n−s.
(3.12) can then be used to obtain
n−s−1∑l=1
mlcln−s = −an−smn−s
which, using (3.14) becomes
−an−smn−s =m1
c1n−s
n−s−1∑l=1
(cln−s)2
=m1
c1n−s
(δ2n−s
n− s− 1− a2
n−s
)=
m1
c1n−s
(δ2n−s − (n− s− 1)a2
n−sn− s− 1
)=m1(mn−s)2δ2n−s(n− s− 1)c1n−s
.
Thus,
c1n−s = −m1mn−sδ2n−s
(n− s− 1)an−s.
20
Now we can use (3.14) to find the c2n−s, . . . , cn−s−1n−s . For l = 2, . . . , n− s− 1,
cln−s =mlc1n−sm1
= −mlm1mn−sδ2n−s
m1(n− s− 1)an−s
= −mlmn−sδ2n−s
(n− s− 1)an−s.
A similar development follows for c1, . . . , cn−s−1.
3.2.4 The (2, 3) Problem
In this short section, we state the results of the previous section for the case k = 2
and n = 3. The contact points listed below are correct for the planes listed just
before them. If one of the other planes are used, the contact points should be
adjusted appropriately. We have that
r(2, 3) =
√a2
1 + a22 + a2
3
2if a3 ≤
√a2
1 + a22,
√a2
1 + a22 if a3 >
√a2
1 + a22
and M = {x ∈ R3|〈m,x〉 = 0} where
m =
√−a2
1 + a22 + a2
3
a21 + a2
2 + a23
√a2
1 − a22 + a2
3
a21 + a2
2 + a23
√a2
1 + a22 − a2
3
a21 + a2
2 + a23
21
if a3 ≤√a2
1 + a22. If however, a3 >
√a2
1 + a22, then we find
m =
a2√a2
1 + a22
a1√a2
1 + a22
0
.
The contact points are given as follows:
c1 =
(a1,−
√−a2
1 + a22 + a2
3
√a2
1 − a22 + a2
3
2a1,−√−a2
1 + a22 + a2
3
√a2
1 + a22 − a2
3
2a1
)
c2 =
(−√a2
1 − a22 + a2
3
√−a2
1 + a22 + a2
3
2a2, a2,−
√a2
1 − a22 + a2
3
√a2
1 + a22 − a2
3
2a2
)
c3 =
(−√a2
1 + a22 − a2
3
√−a2
1 + a22 + a2
3
2a3,−√a2
1 + a22 − a2
3
√a2
1 − a22 + a2
3
2a3, a3
)
for a3 ≤√a2
1 + a22, and they are given as
c1 = (a1,−a2, 0)
c2 = (−a1, a2, 0)
when a3 >√a2
1 + a22. Figure 3.1 will help to illustrate the (2, 3) case.
3.3 The (k, n) Problem
In this section, we will generalize the results of section 3.2 to find the radius of the
largest k-ball contained in an n-box. Let M be a k-flat which contains a k-ball
B ⊂ B. Let m1, . . . ,mn−k be the n− k normal vectors to M which satisfy
(3.15) 〈mi,mj〉 =
1, if i = j,
0, if i 6= j.
We first consider the case of when a k-flatM (and thus the k-ball B) is parallel
to some of the Hj ’s. If M is parallel to Hj for some j = 1, . . . , n, then M ⊂−→Hj ,
22
Figure 3.1:The largest 2-ball in a 3-box for(1) : a3 <
√a2
1 + a22, (2) : a3 =
√a2
1 + a22, (3) : a3 >
√a2
1 + a22
with cross sections in the xy-plane.
or equivalently,−→H⊥j ⊂ M⊥. Since
−→H⊥j = span{ej}, it follows that ej ∈ M⊥, and
since dim(M⊥) = n − k, we have that M can be parallel to at most n − k of the
Hj ’s. For a given f = 1, . . . , n − k, there are(nf
)distinct k-flats M that are lying
parallel to exactly f of the Hj ’s. We can view any k-ball B lying on such a k-flat
M as being contained in an (n − f)-box of dimensions 2a1 × . . . × 2an−f where
0 < a1 ≤ . . . ≤ an−f are the aj ’s associated with B such that M is not parallel
to Hj . Thus, we have(nf
)such boxes to consider. Our work is reduced when we
realize that we only have to consider the largest of these (n− f)-boxes, since if any
k-ball of radius r is contained in one of the smaller (n − f)-boxes, then a k-ball
of the same radius can also be contained in the largest (n − f)-box since each of
its measurements is no smaller then the corresponding measurements of the smaller
(n − f)-boxes. Since the aj ’s are increasing, we only need to consider the k-ball Blying on the k-flat M that is parallel to the first f of the Hj ’s. We can view this
23
ball as being contained in an (n − f)-box of dimensions 2af+1 × . . . × 2an. The
important thing here to note is that M is not parallel to any of the faces of the
(n − f)-box. Since we have not developed a formula for the maximal radius of a
k-ball in an (n − f)-box yet, we will simply denote it as r(k, n − f). We will say
more about this at the end of this section, but for now we will assume that M is
not parallel to Hj for j = 1, . . . , n, or equivalently, that none of theMj ’s are empty.
3.3.1 A Formula for dj
We will now develop a formula for dj , j = 1, . . . , n. Since M is not parallel to any
Hj , it follows that ej 6∈ span{m1, . . . ,mn−k}. Let p ∈ Mj . Then p ∈ M so that
〈p,mi〉 = 0 for i = 1, . . . , n− k, and p ∈ Hj so 〈p, ej〉 = aj . Again, we can write p
as ajv for a suitable v. The distance, d′j , from the origin to the the (n− k + 1)-flat
p+M⊥j is given by
(3.16) (d′j)2 =
Gram(p,m1, . . . ,mn−k, ej)Gram(m1, . . . ,mn−k, ej)
.
The following two lemmas give formulas for evaluating (3.16), the first deals with
the denominator, the second with the numerator.
Lemma 3.5. Gram(m1, . . . ,mn, ej) = 1−n∑i=1
(mji )
2.
Proof. We will use induction on n. For n = 1,
Gram(m1, ej) =
∣∣∣∣∣∣ 1 mj1
mj1 1
∣∣∣∣∣∣ = 1− (mj1)2.
Now, assume Gram(m1, . . . ,mn, ej) = 1−n∑i=1
(mji )
2.
Gram(m1, . . . ,mn+1, ej) =
∣∣∣∣∣∣∣∣∣∣∣∣∣∣
1 0 . . . 0 mj1
0 1 . . . 0 mj2
......
. . ....
...
0 0 . . . 1 mjn+1
mj1 mj
2 . . . mjn+1 1
∣∣∣∣∣∣∣∣∣∣∣∣∣∣
24
=
∣∣∣∣∣∣∣∣∣∣∣
1 . . . 0 mj2
.... . .
......
0 . . . 1 mjn+1
mj2 . . . mj
n+1 1
∣∣∣∣∣∣∣∣∣∣∣+ (−1)n+1mj
1
∣∣∣∣∣∣∣∣∣∣∣
0 . . . 0 mj1
1 . . . 0 mj2
.... . .
......
0 . . . 1 mjn+1
∣∣∣∣∣∣∣∣∣∣∣= 1−
n+1∑i=2
(mji )
2 + (−1)n+1(−1)n(mj1)2
= 1−n+1∑i=1
(mji )
2.
Lemma 3.6. Gram(p,m1, . . . ,mn, ej) = a2j‖v‖2
(1−
n∑i=1
(mji )
2
)− a2
j .
Proof.
Gram(p,m1, . . . ,mn, ej) =
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
a2j‖v‖2 0 0 . . . 0 aj
0 1 0 . . . 0 mj1
0 0 1 . . . 0 mj2
......
.... . .
......
0 0 0 . . . 1 mjn
aj mj1 mj
2 . . . mjn 1
∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣∣
= a2j‖v‖2 Gram(m1, . . . ,mn, ej) + (−1)n+1aj
∣∣∣∣∣∣∣∣∣∣∣∣∣∣
0 0 . . . 0 aj
1 0 . . . 0 mj1
0 1 . . . 0 mj2
......
. . ....
...
0 0 . . . 1 mjn
∣∣∣∣∣∣∣∣∣∣∣∣∣∣= a2
j‖v‖2(
1−n∑i=1
(mji )
2
)+ (−1)n+1(−1)na2
j
= a2j‖v‖2
(1−
n∑i=1
(mji )
2
)− a2
j .
25
We can now use the results of the previous two lemmas and (3.16) to find a
formula for dj .
Lemma 3.7. For j = 1, . . . , n,
(3.17) d2j =
a2j
1−∑n−k
i=1 (mji )2
.
Proof.
d2j = ‖p‖2 − (d′j)
2
= a2j‖v‖2 −
Gram(p,m1, . . . ,mn−k, ej)Gram(m1, . . . ,mn−k, ej)
=a2j‖v‖2
(1−
∑n−ki=1 (mj
i )2)− a2
j‖v‖2(
1−∑n−k
i=1 (mji )
2)
+ a2j
1−∑n−k
i=1 (mji )2
=a2j
1−∑n−k
i=1 (mji )2
.
Lemma 3.8. For any k-flat M not parallel to Hj for j = 1, . . . , n, we have
(3.18)n∑j=1
a2j
d2j
= k.
Proof.
n∑j=1
a2j
d2j
=n∑j=1
a2j
(1−
∑n−ki=1 (mj
i )2)
a2j
=n∑j=1
1−n∑j=1
n−k∑i=1
(mji )
2
= n−n−k∑i=1
n∑j=1
(mji )
2 = n−n−k∑i=1
(1) = n− (n− k) = k.
26
3.3.2 Choosing an Orthonormal Basis
Let 1 ≤ k′ < n′. In the next section, we will be interested in finding a set of n′ − k′
orthonormal vectors m1, . . . ,mn′−k′ ∈ Rn′ such that for each j = 1, . . . , n′, we have
n′−k′∑i=1
(mji )
2 =δ2n′ − k′a2
j
δ2n′(3.19)
where 0 < a1 ≤ a2 ≤ . . . ≤ an′ and an′ ≤δn′−1√k′ − 1
. In this section, we find such a
set of n′−k′ vectors. We will first consider an n′× (n′−1) matrix with orthonormal
columns that will be useful in defining m1, . . . ,mn′−k′ . In what follows, ci denotes
cos(θi) and si denotes sin(θi), for i = 1, . . . , n′ − 1 and for each i, 0 ≤ θi < 2π.
Lemma 3.9. The n′ × (n′ − 1) matrix
(3.20)
R =
c1 0 . . . 0 0
−s1s2 c2 . . . 0 0
−s1c2s3 −s2s3 . . . 0 0
−s1c2c3s4 −s2c3s4 . . . 0 0...
.... . .
......
−s1c2 · · · cn′−3sn′−2 −s2c3 · · · cn′−3sn′−2 . . . cn′−2 0
−s1c2 · · · cn′−2sn′−1 −s2c3 · · · cn′−2sn′−1 . . . −sn′−2sn′−1 cn′−1
−s1c2 · · · cn′−2cn′−1 −s2c3 · · · cn′−2cn′−1 . . . −sn′−2cn′−1 −sn′−1
has orthonormal columns.
Proof. Let ri denote the ith column of R. We will first show that the ri’s have unit
length. Of course, this is clear for rn′−1, so we assume 1 ≤ i ≤ n′ − 2.
‖ri‖2 = c2i + s2i
n′−1∑p=i+1
(p−1∏l=i+1
c2l
)s2p + s2i
n′−1∏l=i+1
c2l
= c2i + s2i
n′−1∑p=i+1
(s2p
p−1∏l=i+1
c2l
)+
n′−1∏l=i+1
c2l
.
27
Pulling the last term out of the sum and the second product, we have
‖ri‖2 = c2i + s2i
n′−2∑p=i+1
(s2p
p−1∏l=i+1
c2l
)+ s2n′−1
n′−2∏l=i+1
c2l + c2n′−1
n′−2∏l=i+1
c2l
,
and since c2i + s2i = 1 for i = 1, . . . , n′ − 1,
‖ri‖2 = c2i + s2i
n′−2∑p=i+1
(s2p
p−1∏l=i+1
c2l
)+
n′−2∏l=i+1
c2l
.
We continue in this fashion until we get
‖ri‖2 = c2i + s2i
i+2∑p=i+1
(s2p
p−1∏l=i+1
c2l
)+
i+2∏l=i+1
c2l
.
Now we use the convention that for any function f and any integers i,m with m < i,
the empty product∏mj=i f(j) = 1. With this in mind, we have
‖ri‖2 = c2i + s2i(s2i+1 + s2i+2c
2i+1 + c2i+1c
2i+2
)= c2i + s2i
(s2i+1 + c2i+1
)= c2i + s2i
= 1,
and so each column of R has unit length. For orthogonality, we first assume 1 ≤i1 < i2 ≤ n′ − 2. We have
〈ri1 , ri2〉 = si1si2
−ci2 i2−1∏l=i1+1
cl +n′−1∑p=i2+1
s2p p−1∏l1=i1+1
cl1
p−1∏l2=i2+1
cl2
+
+n′−1∏
l1=i1+1
cl1
n′−1∏l2=i2+1
cl2
.
28
By pulling out the last term of the sum and each of the last two products, we get
〈ri1 , ri2〉 = si1si2
−ci2 i2−1∏l=i1+1
cl +n′−2∑p=i2+1
s2p p−1∏l1=i1+1
cl1
p−1∏l2=i2+1
cl2
+
+ s2n′−1
n′−2∏l1=i1+1
cl1
n′−2∏l2=i2+1
cl2 + c2n′−1
n′−2∏l1=i1+1
cl1
n′−2∏l2=i2+1
cl2
.
Again, since c2i + s2i = 1 for i = 1, . . . , n′ − 1,
〈ri1 , ri2〉 = si1si2
−ci2 i2−1∏l=i1+1
cl +n′−2∑p=i2+1
s2p p−1∏l1=i1+1
cl1
p−1∏l2=i2+1
cl2
+
+n′−2∏
l1=i1+1
cl1
n′−2∏l2=i2+1
cl2
.
We continue to pull the last term out of the sum and each of the last two products,
until we arrive at
〈ri1 , ri2〉 = si1si2
−ci2 i2−1∏l=i1+1
cl +i2+2∑p=i2+1
s2p p−1∏l1=i1+1
cl1
p−1∏l2=i2+1
cl2
+
+i2+2∏
l1=i1+1
cl1
i2+2∏l2=i2+1
cl2
.
Now, (recalling empty products are defined to be 1),
i2+2∑p=i2+1
s2p p−1∏l1=i1+1
cl1
p−1∏l2=i2+1
cl2
= s2i2+1
i2∏l=i1+1
cl + s2i2+2
i2+1∏l=i1+1
cl
ci2+1,
and
i2+2∏l1=i1+1
cl1
i2+2∏l2=i2+1
cl2 = c2i2+2
i2+1∏l=i1+1
cl
ci2+1,
29
so that
〈ri1 , ri2〉 = si1si2
−ci2 i2−1∏l=i1+1
cl + s2i2+1
i2∏l=i1+1
cl + s2i2+2
i2+1∏l=i1+1
cl
ci2+1+
+ c2i2+2
i2+1∏l=i1+1
cl
ci2+1
= si1si2
−ci2 i2−1∏l=i1+1
cl + s2i2+1
i2∏l=i1+1
cl + ci2+1
i2+1∏l=i1+1
cl
= si1si2
−ci2 i2−1∏l=i1+1
cl + s2i2+1
i2∏l=i1+1
cl + c2i2+1
i2∏l=i1+1
cl
= si1si2
−ci2 i2−1∏l=i1+1
cl +i2∏
l=i1+1
cl
= si1si2
−ci2 i2−1∏l=i1+1
cl + ci2
i2−1∏l=i1+1
cl
= 0.
Finally, for 1 ≤ i ≤ n′ − 2 we have (using empty products as necessary)
〈ri, rn′−1〉 = −si
(n′−2∏l=i+1
cl
)sn′−1cn′−1 + si
(n′−1∏l=i+1
cl
)cn′−1sn′−1
= 0,
and the n′ − 1 columns of R form an orthonormal set.
Our goal now is to decide which of the n′ − k′ columns of R to use as the mi’s,
and which values of c1, s1 . . . , cn′−1, sn′−1 will satisfy (3.19) for j = 1, . . . , n′. The
next three lemmas will be useful in deciding which columns of R to use. They will
be followed by a theorem that will give values for c1, s1, . . . , cn′−1, sn′−1. First, for
i = 1, . . . , n′ − k′ + 1, consider the following n′ − k′ + 1 sequences:
(3.21){
(j − i+ 1)δ2n′ − k′δ2j}k′+i−1
j=i.
Lemma 3.10. For every i = 1, . . . , n′ − k′ + 1, there is a j = i, . . . , k′ + i− 1 such
30
that (j − i + 1)δ2n′ − k′δ2j ≥ 0. Let ti be the smallest such j for each i. Then the
sequence {ti}n′−k′+1i=1 is strictly increasing.
Proof. To show the existence of such a j for each i = 1, . . . , n′ − k′ + 1, set j =
k′ + i − 1. We have that the k′ + i − 1 term of each sequence is k′δ2n′ − k′δ2k′+i−1
which is nonnegative since i ≤ n′ − k′ + 1 implies k′ + i − 1 ≤ n′. Now, for
i = 1, . . . , n′ − k′, let ti be the smallest j such that (j − i + 1)δ2n′ − k′δ2j ≥ 0.
Then, for j = i, . . . , ti − 1, (j − i + 1)δ2n′ − k′δ2j < 0. From this, it follows that
(j − i)δ2n′ − k′δ2j < 0 for j = i + 1, . . . , ti. Thus, the j = ti + 1 term of the i + 1
sequence {(j − i)δ2n′ − k′δ2j }k′+ij=i+1 is the first term that can be non-negative so that
ti+1 > ti. This shows that {ti}n′−k′+1i=1 is a strictly increasing sequence.
Lemma 3.11. Let k′ ≥ 2 and i = 1, . . . , n′ − k′ + 1. If (j − i + 1)δ2n′ − k′δ2j >(j − i)δ2n′ − k′δ2j−1 for j ≥ i+ 1, then (j − i)δ2n′ − k′δ2j−1 > (j − i− 1)δ2n′ − k′δ2j−2.
Proof. For i = 1, . . . , n′ − k′ + 1 and j ≥ i+ 1. We have
(j − i+ 1)δ2n′ − k′δ2j > (j − i)δ2n′ − k′δ2j−1
(j − i+ 1− (j − i))δ2n′ > k′(δ2j − δ2j−1)
δ2n′ > k′a2j .
But since aj ≥ aj−1,
δ2n′ > k′a2j−1
(j − i− (j − i− 1))δ2n′ > k′(δ2j−1 − δ2j−2)
(j − i)δn′ − k′δ2j−1 > (j − i− 1)δ2n′ − k′δ2j−2.
Lemma 3.12. For i = 1, . . . , n′ − k′ + 1, the sequence {(j − i+ 1)δ2n′ − k′δ2j }tij=i is
strictly increasing and only the j = ti term is nonnegative.
Proof. Let i = 1, . . . , n′ − k′ + 1. If ti = i, the result is trivial as the ith sequence
contains only one term, which is nonnegative. If ti > i then ti is the smallest value of
j such that (j−i+1)δ2n′−k′δ2j ≥ 0. Setting j = ti−1, we then have (j−i)δ2n′−k′δ2j−1 <
31
0 and so (j − i+ 1)δ2n′ − k′δ2j > (j − i)δ2n′ − k′δ2j−1. Repeated applications of lemma
3.11 show for j = i+1, . . . , ti, (j−i+1)δn′−k′δ2j > (j−i)δn′−k′δ2j−1 as required.
Remark 3.1. For k′ = 1, . . . , n′− 1 we have that t1 = 1 since for i = 1, the sequence
{jδ2n′ − k′δ2j }kj=1 has as its first term δ2n′ − k′δ21 = δ2n′ − k′a21 which is clearly positive.
Setting i = n′−k′+1, we find the resulting sequence {(j−n′+k′)δ2n′−k′δ2j }n′−1j=n′−k′+1
has as its last term k′δ2n′ − k′δ2n′ = 0. So tn′−k′+1 ≤ n′. Thus, the first n′ − k′ ti’seach correspond to a unique column of the matrix R.
We can now use the ti’s and the matrix R to decide which vectors to use for the
mi’s.
Theorem 3.4. Let mi be the tith column of R for i = 1, . . . , n′ − k′. Let
c2j =
(j − i+ 1)δ2n′ − k′δ2j(j − i+ 1)δ2n′ − k′δ2j−1
if j = ti,
(j − i)δ2n′ − k′δ2j(j − i− 1)δ2n′ − k′δ2j−1
if ti < j < ti+1,
0 if tn′−k′+1 < j < n′,
(3.22)
s2j =
k′a2j
(j − i+ 1)δ2n′ − k′δ2j−1
if j = ti,
k′a2j − δ2n′
(j − i− 1)δ2n′ − k′δ2j−1
if ti < j < ti+1,
1 if tn′−k′+1 < j < n′,
(3.23)
where i = 1, . . . , n′−k′+ 1 and j = 1, . . . , n′−1. Then the mi’s are an orthonormal
set and for j = 1, . . . , n′,
(3.24)n′−k′∑i=1
(mji )
2 =δ2n′ − k′a2
j
δ2n′.
Proof. First, we will show that our definitions for c2j and s2j make sense. First, for
j = ti where i = 1, . . . , n′ − k′ + 1 we have
32
c2j + s2j =(j − i+ 1)δ2n′ − k′δ2j + k′a2
j
(j − i+ 1)δ2n′ − k′δ2j−1
=(j − i+ 1)δ2n′ − k′δ2j−1
(j − i+ 1)δ2n′ − k′δ2j−1
= 1,
and for j 6= ti where i = 1, . . . , n′ − k′ we have
c2j + s2j =(j − i)δ2n′ − k′δ2j + k′a2
j − δ2n′(j − i− 1)δ2n′ − k′δ2j−1
=(j − i− 1)δ2n′ − k′δ2j−1
(j − i− 1)δ2n′ − k′δ2j−1
= 1.
Of course, c2j + s2j = 1 for j > tn′−k′+1. Now, for j = ti with i = 1, . . . , n′ − k′ + 1,
(j − i + 1)δ2n′ − k′δ2j ≥ 0 since it is the first non-negative term of the ith sequence
{(j − i + 1)δ2n′ − k′δ2j }k′+i−1j=i . Now (j − i + 1)δ2n′ − k′δ2j < (j − i + 1)δ2n′ − k′δ2j−1,
so that 0 ≤ c2j < 1 and so 0 < s2j ≤ 1. For ti < j < ti+1 with i = 1, . . . , n′ − k′,(j − i)δ2n′ − k′δ2j < 0 as it is the jth term of the sequence {(j − i)δ2n′ − k′δ2j }
k′+ij=i+1
and since lemma 3.12 guarantees (j − i− 1)δ2n′ − k′δ2j−1 < (j − i)δ2n′ − k′δ2j we find
0 < c2j < 1 and thus 0 < s2j < 1. So our definitions of c2j and s2j are well defined. The
fact that the mi’s form an orthonormal set follows from the result that the columns
of R form an orthonormal set.
To prove (3.24), we will use induction on j. For j = 1, we have that i = 1 and
ti = 1 so that
n′−k′∑i=1
(m1i )
2 = c21 =δ2n′ − k′a2
1
δ2n′.
Our inductive step actually consists of six cases, which must be proved individually.
For the first four cases, we assume 1 ≤ j ≤ n′ − 2.
Case I: j = ti and ti < j + 1 < ti+1 for some i = 1, . . . , n′ − k′.If we assume that (3.24) is satisfied for j = ti for some i = 1, . . . , n′ − k′, then we
have that
n′−k′∑i=1
(mji )
2 =ti−1∑l=1
(s2tlc
2tl+1 · · · c2j−1s
2j
)+ c2j
=δ2n′ − k′a2
j
δ2n′.
33
Then,
n′−k′∑i=1
(mj+1i )2 =
ti−1∑l=1
(s2tlc
2tl+1 · · · c2js2j+1
)+ s2js
2j+1
=
(ti−1∑l=1
(s2tlc
2tl+1 · · · c2j−1s
2j
)) c2js2j+1
s2j+ s2js
2j+1,
or more simply,
(3.25)n′−k′∑i=1
(mj+1i )2 = s2j+1
((δ2n′ − k′a2
j
δ2n′− c2j
)c2js2j
+ s2j
).
Using (3.22) and (3.23), we find
c2j =(j − i+ 1)δ2n′ − k′δ2j
(j − i+ 1)δ2n′ − k′δ2j−1
,
s2j =k′a2
j
(j − i+ 1)δ2n′ − k′δ2j−1
,
s2j+1 =k′a2
j+1 − δ2n′(j − i)δ2n′ − k′δ2j
.
We then find
δ2n′ − k′a2j
δ2n′− c2j =
δ2n′ − k′a2j
δ2n′−
(j − i+ 1)δ2n′ − k′δ2j(j − i+ 1)δ2n′ − k′δ2j−1
= 1−k′a2
j
δ2n′−
(j − i+ 1)δ2n′ − k′δ2j−1
(j − i+ 1)δ2n′ − k′δ2j−1
+k′a2
j
(j − i+ 1)δ2n′ − k′δ2j−1
=k′a2
j
(j − i+ 1)δ2n′ − k′δ2j−1
−k′a2
j
δ2n′,
34
and
c2js2j
=(j − i+ 1)δ2n′ − k′δ2j
(j − i+ 1)δ2n′ − k′δ2j−1
·(j − i+ 1)δ2n′ − k′δ2j−1
k′a2j
=(j − i+ 1)δ2n′ − k′δ2j
k′a2j
=(j − i+ 1)δ2n′ − k′δ2j−1
k′a2j
− 1,
so that(δ2n′ − k′a2
j
δ2n′− c2j
)c2js2j
= 1−k′a2
j
(j − i+ 1)δ2n′ − k′δ2j−1
−(j − i+ 1)δ2n′ − k′δ2j
δ2n′.
Adding s2j yields
(δ2n′ − k′a2
j
δ2n′− c2j
)c2js2j
+ s2j = 1−(j − i+ 1)δ2n′ − k′δ2j
δ2n′
= −(j − i)δ2n′ − k′δ2j
δ2n′,
and we find, after multiplying by s2j+1, that (3.25) becomes
n′−k′∑i=1
(mj+1i )2 = −
k′a2j+1 − δ2n′
(j − i)δ2n′ − k′δ2j·
(j − i)δ2n′ − k′δ2jδ2n′
=δ2n′ − k′a2
j+1
δ2n′.
Thus (3.24) hold for j = ti and ti < j + 1 < ti+1 for some i = 1, . . . , n′ − k′.
Case II: ti < j and j + 1 < ti+1 for some i = 1, . . . , n′ − k′.Suppose (3.24) holds for such a value of j. That is,
n′−k′∑i=1
(mji )
2 =ti∑l=1
s2tlc2tl+1 · · · c2j−1s
2j
=δ2n′ − k′a2
j
δ2n′.
35
Then
n′−k′∑i=1
(mj+1i )2 =
ti∑l=1
(s2tlc
2ti+1 · · · c2js2j+1
)=
(ti∑l=1
(s2tlc
2ti+1 · · · c2j−1s
2j
)) c2js2j+1
s2j,
and thus
(3.26)n′−k′∑i=1
(mj+1i )2 =
(δ2n′ − k′a2
j
δ2n′
)c2js
2j+1
s2j.
Using equations (3.22) and (3.23) we find
c2j =(j − i)δ2n′ − k′δ2j
(j − i− 1)δ2n′ − k′δ2j−1
,
s2j =k′a2
j − δ2n′(j − i− 1)δ2n′ − k′δ2j−1
,
s2j+1 =k′a2
j+1 − δ2n′(j − i)δ2n′ − k′δ2j
.
Then we see
c2js2j+1
s2j=
(j − i)δ2n′ − k′δ2j(j − i− 1)δ2n′ − k′δ2j−1
·k′a2
j+1 − δ2n′(j − i)δ2n′ − k′δ2j
·(j − i− 1)δ2n′ − k′δ2j−1
k′a2j − δ2n′
=k′a2
j+1 − δ2n′k′a2
j − δ2n′,
and so (3.26) becomes
n′−k′∑i=1
(mj+1i )2 =
(δ2n′ − k′a2
j
δ2n′
)k′a2
j+1 − δ2n′k′a2
j − δ2n′
=δ2n′ − k′a2
j+1
δ2n′.
And thus (3.24) hold for all j such that ti < j and j + 1 < ti+1 for some i =
1, . . . , n′ − k′.
36
Case III: ti < j < ti+1 and j + 1 = ti+1 for some i = 1, . . . , n′ − k′ − 1.
If we assume (3.24) holds for such a value of j, then we have
n′−k′∑i=1
(mji )
2 =ti∑l=1
(s2tlc
2tl+1 · · · c2j−1s
2j
)=δ2n′ − k′a2
j
δ2n′.
Then,
n′−k′∑i=1
(mj+1i )2 =
ti∑l=1
(s2tlc
2tl+1 · · · c2js2j+1
)+ c2j+1
=
(ti∑l=1
(s2tlc
2tl+1 · · · c2j−1s
2j
)) c2js2j+1
s2j+ c2j+1.
So for j + 1 = ti+1 we have
(3.27)n′−k′∑i=1
(mj+1i )2 =
(δ2n′ − k′a2
j
δ2n′
)c2js
2j+1
s2j+ c2j+1.
Using (3.22) and (3.23), we find
c2j =(j − i)δ2n′ − k′δ2j
(j − i− 1)δ2n′ − k′δ2j−1
,
s2j =k′a2
j − δ2n′(j − i− 1)δ2n′ − k′δ2j−1
,
c2j+1 =(j − i+ 1)δ2n′ − k′δ2j+1
(j − i+ 1)δ2n′ − k′δ2j,
s2j+1 =k′a2
j+1
(j − i+ 1)δ2n′ − k′δ2j,
so that
c2js2j+1
s2j=
(j − i)δ2n′ − k′δ2j(j − i− 1)δ2n′ − k′δ2j−1
·k′a2
j+1
(j − i+ 1)δ2n′ − k′δ2j·
(j − i− 1)δ2n′ − k′δ2j−1
k′a2j − δ2n′
=
((j − i)δ2n′ − k′δ2j
)(k′a2
j+1
)(
(j − i+ 1)δ2n′ − k′δ2j)(
k′a2j − δ2n′
) ,
37
and (3.27) becomes
n′−k′∑i=1
(mj+1i )2 = −
k′a2j+1
((j − i)δ2n′ − k′δ2j
)δ2n′(
(j − i+ 1)δ2n′ − k′δ2j) +
(j − i+ 1)δ2n′ − k′δ2j+1
(j − i+ 1)δ2n′ − k′δ2j
= −k′a2
j+1
((j − i+ 1)δ2n′ − k′δ2j − δ2n′
)δ2n′(
(j − i+ 1)δ2n′ − k′δ2j) +
(j − i+ 1)δ2n′ − k′δ2j+1
(j − i+ 1)δ2n′ − k′δ2j
= −k′a2
j+1
δ2n′+
k′a2j+1δ
2n′
δ2n′(
(j − i+ 1)δ2n′ − k′δ2j) +
(j − i+ 1)δ2n′ − k′δ2j+1
(j − i+ 1)δ2n′ − k′δ2j
= −k′a2
j+1
δ2n′+
(j − i+ 1)δ2n′ − k′δ2j+1 + k′a2j+1
(j − i+ 1)δ2n′ − k′δ2j
= −k′a2
j+1
δ2n′+ 1
=δ2n′ − k′a2
j+1
δ2n′.
Thus, (3.24) is satisfied when ti < j < ti+1 and j + 1 = ti+1 for some i =
1, . . . , n′ − k′ − 1.
Case IV: j = ti and j + 1 = ti+1 for some i = 1, . . . , n′ − k′ − 1.
In this case, we assume that
n′−k′∑i=1
(mji )
2 =ti−1∑l=1
(s2tlc
2tl+1 · · · c2j−1s
2j
)+ c2j
=δ2n′ − k′a2
j
δ2n′.
Then,
n′−k′∑i=1
(mj+1i )2 =
ti−1∑l=1
(s2tlc
2tl+1 · · · c2js2j+1
)+ s2js
2j+1 + c2j+1
=
(ti−1∑l=1
(s2tlc
2tl+1 · · · c2j−1s
2j
)) c2js2j+1
s2j+ s2js
2j+1 + c2j+1,
38
so for j = ti, j + 1 = ti+1 we have
(3.28)n′−k′∑i=1
(mj+1i )2 =
(δ2n′ − k′a2
j
δ2n′− c2j
)c2js
2j+1
s2j+ s2js
2j+1 + c2j+1.
Using (3.22) and (3.23) we get
c2j =(j − i+ 1)δ2n′ − k′δ2j
(j − i+ 1)δ2n′ − k′δ2j−1
,
s2j =k′a2
j
(j − i+ 1)δ2n′ − k′δ2j−1
,
c2j+1 =(j − i+ 1)δ2n′ − k′δ2j+1
(j − i+ 1)δ2n′ − k′δ2j,
s2j+1 =k′a2
j+1
(j − i+ 1)δ2n′ − k′δ2j.
We then find
c2js2j+1
s2j=
(j − i+ 1)δ2n′ − k′δ2j(j − i+ 1)δ2n′ − k′δ2j−1
·k′a2
j+1
(j − i+ 1)δ2n′ − k′δ2j·
(j − i+ 1)δ2n′ − k′δ2j−1
k′a2j
=a2j+1
a2j
.
Also,
δ2n′ − k′a2j
δ2n′− c2j = 1−
k′a2j
δ2n′−
(j − i+ 1)δ2n′ − k′δ2j(j − i+ 1)δ2n′ − k′δ2j−1
=−k′a2
j
δ2n′+
(j − i+ 1)δ2n′ − k′δ2j−1
(j − i+ 1)δ2n′ − k′δ2j−1
−(j − i+ 1)δ2n′ − k′δ2j
(j − i+ 1)δ2n′ − k′δ2j−1
= −k′a2
j
δ2n′+
k′a2j
(j − i+ 1)δ2n′ − k′δ2j−1
.
Since
c2j+1 =(j − i+ 1)δ2n′ − k′δ2j+1
(j − i+ 1)δ2n′ − k′δ2j= 1−
k′a2j+1
(j − i+ 1)δ2n′ − k′δ2j,
39
we can write s2js2j+1 + c2j+1 as
k′a2j
(j − i+ 1)δ2n′ − k′δ2j−1
·k′a2
j+1
(j − i+ 1)δ2n′ + 1− k′δ2j−
k′a2j+1
(j − i+ 1)δ2n′ − k′δ2j
= 1 +(k′)2a2
ja2j+1 − k′a2
j+1
((j − i+ 1)δ2n′ − k′δ2j−1
)(
(j − i+ 1)δ2n′ − k′δ2j−1
)((j − i+ 1)δ2n′ − k′δ2j
)= 1 + k′a2
j+1
k′a2j − (j − i+ 1)δ2n′ + k′δ2j−1(
(j − i+ 1)δ2n′ − k′δ2j−1
)((j − i+ 1)δ2n′ − k′δ2j
)= 1 + k′a2
j+1
k′δ2j − (j − i+ 1)δ2n′((j − i+ 1)δ2n′ − k′δ2j−1
)((j − i+ 1)δ2n′ − k′δ2j
)= 1−
k′a2j+1
(j − i+ 1)δ2n′ − k′δ2j−1
.
Substituting into (3.28) and simplifying yields
n′−k′∑i=1
(mj+1i )2 =
(−k′a2
j
δ2n′+
k′a2j
(j − i+ 1)δ2n′ − k′δ2j−1
)a2j+1
a2j
+ 1−k′a2
j+1
(j − i+ 1)δ2n′ − k′δ2j−1
=−k′a2
j+1
δ2n′+
k′a2j+1
(j − i+ 1)δ2n′ − k′δ2j−1
+ 1−k′a2
j+1
(j − i+ 1)δ2n′ − k′δ2j−1
= 1−k′a2
j+1
δ2n′
=δ2n′ − k′a2
j+1
δ2n′.
Thus, for j = ti and j + 1 = ti+1 for some i = 1, . . . , n′− k′− 1, we have that (3.24)
holds.
So far, we have shown that (3.24) holds for all j = 1, . . . , tn′−k′+1 − 1. We wish
to show it holds for j = tn′−k′+1, . . . , n′ as well, so we must consider an additional
two cases.
Case V: tn′−k′ < n′ − 1.
Since tn′−k′ < n′−1, it follows that tn′−k′ + 1 ≤ tn′−k′+1 ≤ n′. If tn′−k′+1 = n′, then
40
setting j = n′ − 1, we have by assumption
n′−k′∑i=1
(mn′−1i )2 =
tn′−k′∑l=1
(s2tlc
2tl+1 · · · c2n′−2s
2n′−1
)=δ2n′ − k′a2
n′−1
δ2n′.
Then,
n′−k′∑i=1
(mn′i )2 =
tn′−k′∑l=1
(s2tlc
2tl+1 · · · c2n′−2c
2n′−1
)=
tn′−k′∑l=1
(s2tlc
2tl+1 · · · c2n′−2s
2n′−1
) c2n′−1
s2n′−1
,
or more simply,
(3.29)n′−k′∑i=1
(mn′i )2 =
(δ2n′ − k′a2
n′−1
δ2n′
)c2n′−1
s2n′−1
.
From (3.22) and (3.23) we have
c2n′−1 =(k′ − 1)δ2n′ − k′δ2n′−1
(k′ − 2)δ2n′ − k′δ2n′−2
,
s2n′−1 =k′a2
n′−1 − δ2n′(k′ − 2)δ2n′ − k′δ2n′−2
.
Then we find
c2n′−1
s2n′−1
=(k′ − 1)δ2n′ − k′δ2n′−1
k′a2n′−1 − δ2n′
=k′(δ2n′ − δ2n′−1)− δ2n′
k′a2n−′1 − δ2n′
=k′a2
n′ − δ2n′k′a2
n′−1 − δ2n′.
41
(3.29) then becomes
n′−k′∑i=1
(mn′i )2 =
(δ2n′ − k′a2
n′−1
δ2n′
)k′a2
n′ − δ2n′k′a2
n′−1 − δ2n′
=δ2n′ − k′a2
n′
δ2n′,
and we see that (3.24) is satisfied for j = n′ when tn′−k′ < tn′−k′+1−1 and tn′−k′+1 =
n′. If however, tn′−k′+1 < n′, setting j − 1 = tn′−k′+1 − 1, we have that (3.24)
is satisfied for j = tn′−k′+1 by Case III. Now, for i = n′ − k′ + 1, the sequence
{(j − n′ + k′)δ2n′ − k′δ2j }n′j=tn′−k′+1
is increasing since
((j − n′ + k′)δn′ − k′δ2j
)+ δn′ − k′a2
j+1 = (j + 1− n′ + k′)δn′ − k′δ2j+1
and δn′ − k′a2j+1 ≥ 0. But since the first term of {(j − n′ + k′)δ2n′ − k′δ2j }n
′j=tn′−k′+1
is not negative, and the last term is 0 by remark 3.1, it follows that
{(j − n′ + k′)δ2n′ − k′δ2j }n′j=tn′−k′+1
= {0}n′j=tn′−k′+1,
from which we find that δn′ − k′a2j = 0 for j = tn′−k′+1 + 1, . . . , n′. Thus,
n′−k′∑i=1
(mji ) = 0
for j = tn′−k′+1 + 1, . . . , n′. Since c2tn′−k′+1= 0, we have that mj
i = 0 for ev-
ery i = 1, . . . , n′ − k′, where j = tn′−k′+1 + 1, . . . , n′, so (3.24) is satisfied for
j = tn′−k′+1, . . . , n′ when tn′−k′ < n′ − 1
Case VI: tn′−k′ = n′ − 1.
That tn′−k′ = n′ − 1 implies tn′−k′+1 = n′. Thus, tn′−k′ = tn′−k′+1 − 1. Setting
j = n′ − 1, we have by assumption
n′−k′∑i=1
(mn′−1i )2 =
tn′−k′−1∑l=1
(s2tlc
2tl+1 · · · c2n′−2s
2n′−1
)+ c2n′−1
=δ2n′ − k′a2
n′−1
δ2n′,
42
from which it follows that
n′−k′∑i=1
(mn′i )2 =
tn′−k′−1∑l=1
(s2tlc
2tl+1 · · · c2n′−2c
2n′−1
)+ s2n′−1
=tn′−k′−1∑l=1
(s2tlc
2tl+1 · · · c2n′−2s
2n′−1
) c2n′−1
s2n′−1
+ s2n′−1,
giving us
(3.30)n′−k′∑i=1
(mn′i )2 =
(δ2n′ − k′a2
n′−1
δ2n′− c2n′−1
)c2n′−1
s2n′−1
+ s2n′−1.
We again use (3.22) and (3.23) to find
c2n′−1 =k′δ2n′ − k′δ2n′−1
k′δ2n′ − k′δ2n′−2
,
s2n′−1 =k′a2
n′−1
k′δ2n′ − k′δ2n′−2
.
Now, we calculate
c2n′−1
s2n′−1
=k′δ2n′ − k′δ2n′−1
k′a2n′−1
=a2n′
a2n′−1
and
δ2n′ − k′a2n′−1
δ2n′− c2n′−1 = 1−
k′a2n′−1
δ2n′−
k′a2n′
k′δ2n′ − k′δ2n′−2
,
whence(δ2n′ − k′a2
n′−1
δ2n′− c2n′−1
)c2n′−1
s2n′−1
=a2n′
a2n′−1
−k′a2
n′
δ2n′−
a2n′
a2n′−1
·k′a2
n′
k′δ2n′ − k′δ2n′−2
,
43
and adding s2n′−1 we see (3.30) becomes
n′−k′∑i=1
(mn′i )2 =
a2n′
a2n′−1
−k′a2
n′
δ2n′−
a2n′
a2n′−1
·k′a2
n′
k′δ2n′ − k′δ2n′−2
+k′a2
n′−1
k′δ2n′ − k′δ2n′−2
= −k′a2
n′
δ2n′+
a2n′
a2n′−1
(1−
k′a2n′
k′δ2n′ − k′δ2n′−2
)+
k′a2n′−1
k′δ2n′ − k′δ2n′−2
= −k′a2
n′
δ2n′+
a2n′
a2n′−1
(k′δ2n′ − k′δ2n′−2 − k′a2
n′
k′δ2n′ − k′δ2n′−2
)+
k′a2n′−1
k′δ2n′ − k′δ2n′−2
= −k′a2
n′
δ2n′+
a2n′
a2n′−1
(k′δ2n′−1 − k′δ2n′−2
k′δ2n′ − k′δ2n′−2
)+
k′a2n′−1
k′δ2n′ − k′δ2n′−2
= −k′a2
n′
δ2n′+
a2n′
a2n′−1
(k′a2
n′−1
k′δ2n′ − k′δ2n′−2
)+
k′a2n′−1
k′δ2n′ − k′δ2n′−2
= −k′a2
n′
δ2n′+k′a2
n′ + k′a2n′−1
k′δ2n′ − k′δ2n′−2
= −k′a2
n′
δ2n′+ 1
=δ2n′ − k′a2
n′
δ2n′,
and (3.24) is satisfied for j = n′ and tn′−k′ = n′ − 1. Finally, we have that (3.24) is
satisfied for all j = 1, . . . , n′ and the theorem is proven.
3.3.3 Finding r(k, n) and M
Theorem 3.5. Let B′ be a box of dimensions 2a1× . . .×2an′ where 0 < a1 ≤ . . . ≤a′n. The following statements are equivalent:
1. There is a k′-ball of maximal radius tangent to all faces of B′,
2. an′ ≤δn′−1√k′ − 1
.
Proof. (1) ⇒ (2) : Suppose there is an k′-ball of maximal radius tangent to all the
faces of B′. Then d1, . . . , dn′ = t, and (3.18) gives t = δn′√k′
. (3.2) now ensures us
that an′ ≤δn′√k′
, and (3.3) shows an′ ≤δn′−1√k′−1
.
(2) ⇒ (1) : Assume an′ ≤δn′−1√k′−1
. Then (3.3) gives an′ ≤δn′√k′
, or equivalently,
k′a2n′ ≤ δ2n′ . Let m1, . . . ,mn′−k′ be as in theorem 3.4 and let M be a k′-flat such
44
that M⊥mi for i = 1, . . . , n′ − k′. Then,
(3.31)n′−k′∑i=1
(mji )
2 =δ2n′ − k′a2
j
δ2n′, j = 1, . . . , n′.
Therefore, (3.17) gives
d2j =
a2j
1−∑n′−k′
i=1 (mji )2
= a2j
(1−
δ2n′ − k′a2j
δ2n′
)−1
= a2j
(k′a2
j
δ2n′
)−1
=δ2n′
k′
for every j = 1, . . . , n′. The maximality follows from (3.18): If we want to increase
the radius, we need to increase all the dj ’s. But an increase in one, say dj1 , requires
a decrease in another, say dj2 , where ji 6= j2 which would only result in a ball of
strictly smaller radius. It now remains only to show that the point of Mj closest
to the origin lay in the jth face of B′. We will argue by contradiction. Suppose for
some j = 1, . . . , n′, the closest point ofMj to the origin, say pj , is not contained in
B′. Construct a line l from the origin to pj . Since 0 ∈ B′, and l ⊂M, there exists an
i = 1, . . . , n′, i 6= j, such that l ∩Mi = {pi}. It follows that di ≤ ‖pi‖ < ‖pj‖ = dj .
Thus, we have that di 6= dj , clearly contradicting their equality. Thus we have that
the k′-flat M given by the equations
〈mi,x〉 = 0 for i = 1, . . . , n′ − k′
with the mji ’s defined as in (3.31), contains an k′-ball of maximal radius that is
tangent to all faces of B′.
Theorem 3.6. Let s be the smallest integer satisfying
(3.32) an−s ≤δn−s−1√k − s− 1
.
Then the maximal radius of B ⊂ B occurs when d1 = . . . = dn−s = t with the radius
45
being their common value. That is,
(3.33) r(k, n) = t =δn−s√k − s
.
Furthermore, mn−s+1i = . . . = mn
i = 0 for i = 1, . . . n− k.
Proof. If s = 0 is the smallest integer satisfying (3.32) then an ≤ δn−1√k−1
and we have
(from theorem 3.5 with n′ = n, k′ = k)
d1 = . . . = dn = t =δn√k
which gives us (3.33).
Now suppose s = 1 is the smallest integer satisfying (3.32). Then an >δn−1√k−1
and an−1 ≤ δn−2√k−2
. Intersecting B with the hyperplane xn = 0 yields an (n− 1)-box
of dimensions 2a1× . . .×2an−1. Intersecting any k-ball contained in B that doesn’t
lie parallel to any face of B with the same hyperplane gives a (k−1) ball of the same
radius that is contained in the (n−1)-box. From this, we have r(k−1, n−1) ≥ r(k, n).
Applying theorem 3.5 with n′ = n− 1 and k′ = k − 1, we find d1 = . . . = dn−1 = t
where t = δn−1√k−1
= r(k− 1, n− 1) where the (k− 1)-ball is tangent to all faces of the
(n− 1)-box. Using the mji ’s from (3.31), and setting mn
i = 0 for i = 1, . . . , n−k, we
extend the (k − 1)-flat containing the (k − 1)-ball to a k-flat M containing a k-ball
of radius δn−1√k−1
. The ball lies in B since an >δn−1√k−1
. It is maximal in B because
r(k, n) cannot exceed r(k − 1, n− 1). The k-flat M is unique (up to the symmetry
of B) as guaranteed by (3.6).
Now suppose s = 2 is the smallest integer satisfying (3.32). Then an−1 >δn−2√k−2
and an−2 ≤ δn−3√k−3
. Intersecting B with the two hyperplanes xn = 0 and xn−1 = 0
gives us an (n − 2)-box of dimensions 2a1 × . . . × 2an−2. Intersecting any k-ball
contained in B (again, not parallel to any face of B) with the same two hyperplanes
yields a (k− 2)-ball of equal radius which is contained in the (n− 2)-box. It follows
that r(k− 2, n− 2) ≥ r(k, n). Applying theorem 3.5 with n′ = n− 2 and k′ = k− 2,
we have d1 = . . . = dn−2 = t with t = δn−2√k−2
= r(k−2, n−2) and that the (k−2)-ball
is tangent to all faces of the (n − 2)-box. Using the mji ’s from (3.31), and setting
mn−1i = mn
i = 0 for i = 1, . . . , n − k, we extend the (k − 2)-flat containing the
(k − 2)-ball to a k-flat M containing a k-ball of radius δn−2√k−2
. This ball lies in B
since an ≥ an−1 >δn−2√k−2
and it is maximal as r(k, n) is bounded from above by
46
r(k − 2, n− 2). M is unique up to the symmetries of B by (3.6).
We continue in this fashion until an s is found so that (3.32) is satisfied, which
must happen since it is clearly true for s = k − 1.
It has been our assumption in this proof that the k-flat M was not parallel to
Hj for j = 1, . . . , n. At the beginning of this section, we commented that M could
actually be parallel to at most n − k faces of B. We said we could view the k-ball
lying on such a k-flatM as a k-ball in the (n−f)-box of dimensions 2af+1, . . . , 2an,
and that the maximal radius such a k-ball could have in this case was r(k, n − f)
where f = 1, . . . , n−k was the number of faces of B that the k-flatM is parallel to.
We also remarked that any k-ball lying onM was not parallel to any of the faces of
the (n−f)-box. We will now show that r(k, n−f) < r(k, n) where r(k, n) is defined
by (3.33) and s is defined by (3.32). If f = n−k, then r(k, n−f) = r(k, k) = an−k+1.
But, since s < k,
δ2n−sk − s
≥δ2n−k + (k − s)a2
n−k+1
k − s
=δ2n−kk − s
+ a2n−k+1
> a2n−k+1
and so the k-ball cannot be maximal in this case. If f = 1, . . . , n − k − 1, then in
light of (3.33), we have that
r(k, n− f) =
√∑n−si=f+1 a
2i√
k − s,
where s is the smallest integer satisfying an−s ≤qPn−s−1
i=f+1 a2i√
k−s−1. Since
qPn−s−1i=f+1 a
2i√
k−s−1<
δn−s−1√k−s−1
, we have an−s <δn−s−1√k−s−1
from which we can conclude that s ≤ s. For
j = 1, . . . , s− s, we have an−s+j >Pn−s+j−1
i=f+1 a2i
k−s+j−1 , and thus
(k − s+ j − 1)a2n−s+j −
∑n−s+j−1i=f+1 a2
i
(k − s+ j − 1)(k − s+ j)=
a2n−s+j
k − s+ j−
∑n−s+j−1i=f+1 a2
i
(k − s+ j − 1)(k − s+ j)> 0.
47
Then,
∑n−s+j−1i=f+1 a2
i
k − s+ j − 1<
∑n−s+j−1i=f+1 a2
i
k − s+ j − 1+
a2n−s+j
k − s+ j−
∑n−s+j−1i=f+1 a2
i
(k − s+ j − 1)(k − s+ j)
=
∑n−s+j−1i=f+1 a2
i
k − s+ j+
a2n−s+j
k − s+ j
=
∑n−s+ji=f+1 a
2i
k − s+ j.
It then follows that
[r(k, n− f)]2 =∑n−s
i=1 a2i
k − s<
∑n−s+1i=1 a2
i
k − s+ 1< . . . <
∑n−si=f+1 a
2i
k − s<
δ2n−sk − s
= [r(k, n)]2.
and so the k-ball cannot be maximal if f = 1, . . . , n − k − 1, or in other words, if
any k-ball B lies in a k-flat M that is parallel to any of the Hj ’s then B cannot be
of maximal radius in B.
Corollary 3.2. The n− k normal vectors to M satisfy the following conditions:
〈mi,mj〉 =
1 if i = j,
0 if i 6= j,(3.34)
n−k∑i=1
(mji )
2 =
δ2n−s − (k − s)a2
j
δ2n−sif j = 1, . . . , n− s,
0 if j = n− s+ 1, . . . , n,(3.35)
for i = 1, . . . , n− k where s is as in theorem 3.6.
Proof. (3.34) is just (3.15). From theorem 3.6, mn−s+1i , . . . ,mn
i = 0 for i = 1, . . . , n−k. From theorem 3.5, with n′ = n− s, k′ = k − s, we have
n−k∑i=1
(mji )
2 =δ2n−s − (k − s)a2
j
δ2n−s.
3.3.4 The (k, 4) Problem
Finally, we state all results for (k, 4) problem. As mentioned in the introduction,
writing out explicit solutions becomes difficult when n−k becomes large. Because of
48
this, the need for a computer algorithm arises, and the reader will find one supplied
in the Appendix. We will examine three cases, one each for k = 1, k = 2, and k = 3.
In each case, we state first all possible forms of the radius r(k, 4), followed by the
resulting 4− k equations for the location of these balls.
Case I: k = 1
This is the simplest case, as we do not need to check any conditions on the aj ’s or
the δj ’s. For any box B, we have
r(1, 4) = δ4,
and
m1 =
26666666666666666664
pδ24 − a2
1
δ4
− a1a2
δ4pδ24 − a2
1
− a1a3
δ4pδ24 − a2
1
− a1a4
δ4pδ24 − a2
1
37777777777777777775
,m2 =
26666666666666666664
0
sδ24 − δ22δ24 − a2
1
− a2a3p(δ24 − a2
1)(δ24 − δ22)
− a2a4p(δ24 − a2
1)(δ24 − δ22)
37777777777777777775
,m3 =
26666666666666664
0
0
a4pδ24 − δ22
− a3pδ24 − δ22
37777777777777775.
Case II: k = 2
This would seem like the most complicated case, since s can be either 0 or 1, and we
can pick either the first and second or the first and third columns of R depending on
whether δ2n ≥ 2δ22 or δ2n < 2δ22 respectively. However, it is not possible for δ2n < 2δ22 ,
so we only need to consider using the first and second columns of R to locate the
ball of maximal radius. We have the radius as
r(2, 4) =
δ4√
2if a4 ≤ δ3,
δ3 if a4 > δ3,
49
and our normal vectors are given by
m1 =
√δ24 − 2a2
1
δ4
− 2a1a2
δ4√δ24 − 2a2
1
− a1
δ2δ4
√(δ24 − 2δ22)(δ24 − 2a2
3)δ24 − 2a2
1
− a1
δ2δ4
√(δ24 − 2δ22)(δ24 − 2a2
4)δ24 − 2a2
1
,m2 =
0
√δ24 − 2δ22δ24 − 2a2
1
−a2
δ2
√δ24 − 2a2
3
δ24 − 2a21
−a2
δ2
√δ24 − 2a2
4
δ24 − 2a21
when a4 ≤ δ3, and they are given by
m1 =
√δ23 − a2
1
δ3
− a1a2
δ3√δ23 − a2
1
− a1a3
δ3√a2
2 + a23
0
,m2 =
0
√δ23 − δ22δ23 − a2
1
− a2√a2
2 + a23
0
when a4 > δ3.
Case III: k = 3
In this last case, we have to consider the cases when s = 0, 1, or 2. We find that the
radius is
r(3, 4) =
δ4√
3if a4 ≤ δ3√
2,
δ3√2
if a4 >δ3√2, and a3 ≤ δ2,
δ2 if a3 > δ2,
50
and m1 is one of
√δ24 − 3a2
1
δ4
−√δ24 − 3a2
2
δ4
−√δ24 − 3a2
3
δ4
−√δ24 − 3a2
4
δ4
,
√δ23 − 2a2
1
δ3
−√δ23 − 2a2
2
δ3
−√δ23 − 2a2
3
δ3
0
,
√δ22 − a2
1
δ2
−√δ22 − a2
2
δ2
0
0
when a4 ≤ δ3√2, a4 >
δ3√2
and a3 ≤ δ2, or a3 > δ2 respectively. This concludes all
possible cases of the (k, 4) problem.
51
Bibliography
[1] Arthur M. DuPre and Seymour Kass. Distance and parallelism between flats in
Rn. Linear Algebra and its Applications, 171:99–107, 1992.
[2] Hazel Everett, Ivan Stojmenovic, Pavel Valtr, and Sue Whitesides. The largest
k-ball in a d-dimensional box. Comput. Geom., 11(2):59–67, 1998.
[3] Jurgen Gross and Gotz Trenkler. On the least squares distance between affine
subspaces. Linear Algebra and its Applications, 237(1):269–276, 1996.
[4] Seymour Kass. Spaces of closest fit. Linear Algebra and its Applications, 117:93–
97, 1989.
[5] David C. Lay. Linear Algebra and its Applications. Addison Wesley Longman,
2nd edition, 1999.
52
Appendix A
An Algorithm to Find
m1, . . . , mn−k
The following Maple code calculates the n−k normal vectors to the k-flatM which
contains a k-ball of maximal radius r(k, n) in a box B of dimensions 2a1× . . .× 2an
with 0 < a1 ≤ . . . ≤ an. Simply adjust the values of k and n with k < n at
the beginning of the program, as well as the values for a1, . . . , an. The vector
t = (t1, . . . , tn−1) is such that tj = 1 if the jth column of R is used, and tj = 0 if
it is not, and the jth entries of cosvector and sinvector give the value of cos2(θj)
and sin2(θj) respectively for j = 1, . . . , n − s − 1. The italicized text before each
block of code are just comments, and do not need to be copied into a maple file.
restart:
Input k and n values here
k:=7:
n:=8:
Create an array called avector for the measurements of the box in each dimension,then input a value for each of the measurements a1, . . . , an in ascending order andfind the resulting value for sval, which is just the value of s in chapter 3.
avector:=array(1..n):
53
for count1 from 1 to n do
avector[count1]:=1:
od:
sval:=0:
conditioncheck:=-1:
for counter from 0 to k-2 while conditioncheck<=0 do
conditioncheck:=sum(avector[c]^2,c=1..n-sval-1)/(k-sval-1)-avector[n-sval]^2:
if(conditioncheck<=0) then sval:=sval+1:
fi:
od:
Create arrays containing :1) tvector - an array which records the values of the ti’s(For example, t = [1, 0, 0, 1, 1] implies t1 = 1, t2 = 4, t3 = 5)
2) solvector - jth entry containsδ2n−sval−a
2j
δ2n−sval
3) cosvector - contains the values of c1, . . . , cn−s−1
4) sinvector - contains the values of s1, . . . , sn−s−1
5) mmatrix - matrix whose columns are the n − k normal vectors to the k flatcontaining the k-ball of maximal radius in the n-box with dimensions given in avector
tvector:=array(1..n-sval):
solvector:=array(1..n-sval):
cosvector:=array(1..n-sval-1):
sinvector:=array(1..n-sval-1):
mmatrix:=array(1..n-k):
Initializations:1) tvector[1] set to 1 since t1 = 1, all else set to 02) solvector3) cosvector[1] set to solvector[1], all else unassigned4) sinvector[1] set to 1-cosvector[1], all else unassigned5) mmatrix set to zero matrix6) tcounter set to 1, to keep track of how many of the ti’s have been assigned.
tvector[1]:=1:
for count2 from 2 to n-sval-1 do
tvector[count2]:=0
od:
for count3 from 1 to n-sval do
solvector[count3]:=1-((k-sval)*avector[count3]^2)/sum(avector[b]^2,b=1..n-sval):
od:
cosvector[1]:=solvector[1]:
sinvector[1]:=1-cosvector[1]:
for count4 from 1 to n-k do
mmatrix[count4]:=array(1..n):
for count5 from 1 to n do
54
mmatrix[count4][count5]:=0:
od:
od:
tcounter:=1:
Searches for ti’s to use, stops when tcounter = n−k and assigns values to cosvectorand sinvector along the way. Breakpoint used to record where the last ti was found
breakpoint:=1:
for count6 from 2 to n-sval-1 while tcounter<n-k do
check:=sum(tvector[j]*sinvector[j]*product(cosvector[h],h=j+1..count6-1),j=1..count6-1):
if (check<solvector[count6])
then
cosvector[count6]:=(solvector[count6]-check)/(1-check):
sinvector[count6]:=1-cosvector[count6]:
tcounter:=tcounter+1:
tvector[count6]:=1:
else
sinvector[count6]:=solvector[count6]/check:
cosvector[count6]:=1-sinvector[count6]:
fi:
breakpoint:=count6:
od:
Computes any remaining entries in cosvector and sinvector from breakpoint onwards
for i from breakpoint+1 to n-sval-1 do
sinval:=sum(tvector[j]*sinvector[j]*product(cosvector[h],h=j+1..i-1),j=1..i-1):
sinvector[i]:=solvector[i]/sinval:
cosvector[i]:=1-sinvector[i]:
od:
Assigns entries to mmatrix
position:=1:
for count7 from 1 to n-sval-1 do
if (tvector[count7]=1)
then
mmatrix[position][count7]:=sqrt(cosvector[count7]):
for count8 from count7+1 to n-sval-1 do
mmatrix[position][count8]:=
-sqrt(sinvector[count7]*product(cosvector[h],h=count7+1..count8-1)
*sinvector[count8]):
od:
mmatrix[position][n-sval]:=
-sqrt(sinvector[count7]*product(cosvector[h],h=count7+1..n-sval-1)):
55
position:=position+1:
fi:
od:
Outputs the n− k normal vectors
print ("THE", n-k ,"NORMAL VECTORS");
for count9 from 1 to n-k do
print (count9,mmatrix[count9]);
od;
56