![Page 1: What happens to the location estimator if we minimize with a power other that 2?](https://reader034.vdocument.in/reader034/viewer/2022050714/56812c2c550346895d90aa2d/html5/thumbnails/1.jpg)
What happens to the location estimator if we minimize with a power other than 2?
Robert J. Blodgett
Statistic Seminar - March 13, 2008
Outline
1. Effect of different exponents
2. Definition
3. End points of the path
4. Repeated exponents
5. Bounds
6. Directions of the path
7. Outliers
1. Effect of different exponents
When data points come from a symmetric unimodal distribution, the clear target for a location estimator is the maximum of the density. Different exponents are compared for three such distributions. For each exponent, 10,000 simulations of 20 points each were run.
The curves show the number of simulations in which the estimate fell within ¼, ½, and 1 of the maximum, which is at 0.
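A simulation of this kind could be sketched as follows. This is only an illustration: the names `lp_estimate`, `count_within`, and `SAMPLERS`, the random seed, the unit scale of the double exponential, and the reduced number of simulations in the demo loop are assumptions, not details from the talk. The estimator is found by bisection on the derivative condition given in the Definition section, so the sketch applies to exponents above 1.

```python
import math
import random

def lp_estimate(xs, p, iters=60):
    """Minimizer of sum(|x - m|**p) over m, for p > 1, by bisection."""
    def slope(m):
        # Sum of |x - m|**(p-1) * sign(x - m); it decreases as m increases.
        return sum(math.copysign(abs(x - m) ** (p - 1), x - m) for x in xs)
    lo, hi = min(xs), max(xs)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Three symmetric unimodal distributions centered at 0 (scales are assumptions).
SAMPLERS = {
    "Uniform on (-5, 5)": lambda rng: rng.uniform(-5, 5),
    "Normal N(0, 1)": lambda rng: rng.gauss(0, 1),
    "Double exponential": lambda rng: rng.expovariate(1.0) * rng.choice((-1, 1)),
}

def count_within(sampler, power, n_sims=10_000, n_points=20,
                 widths=(0.25, 0.5, 1.0), seed=0):
    """Count simulations whose estimate lies within each width of 0."""
    rng = random.Random(seed)
    counts = [0] * len(widths)
    for _ in range(n_sims):
        xs = [sampler(rng) for _ in range(n_points)]
        m = lp_estimate(xs, power)
        for i, w in enumerate(widths):
            if abs(m) < w:
                counts[i] += 1
    return counts

# Small demo run; raise n_sims to 10_000 to mirror the setup described above.
for name, sampler in SAMPLERS.items():
    for power in (1.5, 2.0, 5.0):
        print(name, power, count_within(sampler, power, n_sims=100))
```

Each call returns three counts, one per width, which correspond to the three curves described above.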
[Plot: Uniform on (-5, 5). Vertical axis: number of simulations, 0 to 10,000; horizontal axis: power, 1.0 to 10.0.]
[Plot: Normal, N(0, 1). Same axes.]
[Plot: Double exponential. Same axes.]
2. Definition
The $l_p$-norm estimator minimizes the following sum as $m(p)$ changes, with $x_1, \dots, x_K$ the data points:
$$\sum_{j=1}^{K} |x_j - m(p)|^{p} \quad \text{for } 1 < p < \infty.$$
Setting the derivative with respect to $m(p)$ equal to zero gives
$$\sum_{j=1}^{K} |x_j - m(p)|^{p-1} \operatorname{sign}(x_j - m(p)) = 0.$$
Expression (1)
Why can we take a derivative?
$$\lim_{x \to 0^+} \frac{|x|^p}{x} = \lim_{x \to 0^+} |x|^{p-1} = \begin{cases} 0 & \text{if } p > 1 \\ 1 & \text{if } p = 1 \end{cases}$$
$$\lim_{x \to 0^-} \frac{|x|^p}{x} = \lim_{x \to 0^-} -|x|^{p-1} = \begin{cases} 0 & \text{if } p > 1 \\ -1 & \text{if } p = 1 \end{cases}$$
Since p > 1, the limit equals 0.
As m(p) increases, each term in expression (1) decreases. Hence, for each exponent, there is a unique minimizing m(p). The minimizing point varies continuously with the exponent.
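Because expression (1) decreases as m(p) increases, its root can be bracketed between the smallest and largest data points and located by bisection. The following is a minimal sketch under that observation; the function name `lp_estimate` and the fixed iteration count are illustrative assumptions.

```python
import math

def lp_estimate(xs, p, iters=60):
    """Root of expression (1) for p > 1: the minimizer of sum(|x - m|**p)."""
    def expression_1(m):
        # Sum of |x_j - m|**(p-1) * sign(x_j - m); it decreases as m increases.
        return sum(math.copysign(abs(x - m) ** (p - 1), x - m) for x in xs)
    lo, hi = min(xs), max(xs)   # expression (1) is >= 0 at lo and <= 0 at hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expression_1(mid) > 0:
            lo = mid            # the root lies to the right of mid
        else:
            hi = mid            # the root lies at or to the left of mid
    return 0.5 * (lo + hi)

data = [0, 0, 2, 2.1, 4]
for p in (1.01, 1.5, 2.0, 5.0, 200.0):
    print(p, round(lp_estimate(data, p), 4))
```

At p = 2 the result is the arithmetic mean, which gives a quick sanity check on the routine.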
The next step is to locate the end points.
3. End points of the path
The limit median is the limit of the minimizing points as the exponent approaches 1.
Properties: (1) For an odd number of data points, or when the middle two are equal, the middle one is the limit median. (2) Otherwise, the limit median is between the two middle data points.
D. Jackson (1921), “Notes on the median of a set of numbers,” Bull. Amer. Math. Soc. 27, pp. 160-164.
Pair the $j$th and $(K-j+1)$st points. Let $x_L$ denote the lower point and $x_U$ the upper point of each pair. The limit median is at the solution to
$$\prod_{\text{all pairs}} (m - x_L) = \prod_{\text{all pairs}} (x_U - m).$$
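The product condition can be solved numerically. The sketch below takes logarithms of both sides, which turns the products into sums, and bisects between the two middle data points; the function name `limit_median` and the bisection approach are assumptions made for illustration, not Jackson's own procedure.

```python
import math

def limit_median(xs, iters=100):
    """Limit of the minimizing point as the exponent approaches 1."""
    xs = sorted(xs)
    k = len(xs)
    lo_mid, hi_mid = xs[(k - 1) // 2], xs[k // 2]
    if lo_mid == hi_mid:
        return lo_mid  # odd K, or the middle two points are equal
    lower, upper = xs[: k // 2], xs[k // 2:]

    def log_gap(m):
        # log of prod(x_U - m) minus log of prod(m - x_L); decreases in m.
        return (sum(math.log(u - m) for u in upper)
                - sum(math.log(m - l) for l in lower))

    lo, hi = lo_mid, hi_mid
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:
            break  # interval has shrunk to adjacent floating-point numbers
        if log_gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(limit_median([0, 0, 2, 2.1, 4]))  # odd K: the middle point
print(limit_median([0, 1, 2, 10]))      # even K: between 1 and 2
```

For the four points 0, 1, 2, 10 the condition reads m(m - 1) = (2 - m)(10 - m), which gives m = 20/11.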
The following bound shows what to assign as the minimizing point when the exponent approaches infinity. The minimizing point and the midrange are within
$$\frac{R}{2}\left(\frac{(K-1)^{1/(p-1)} - 1}{(K-1)^{1/(p-1)} + 1}\right)$$
of each other, where $R$ denotes the range and $K > 1$.
When the exponent approaches 1, the term in parentheses approaches 1. Consequently, the bound then includes all the data points. When the exponent approaches infinity, this term approaches zero. Consequently, the minimizing point approaches the midrange as the exponent approaches infinity.
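The bound can be compared with the actual distance to the midrange. In this sketch the bound is taken to be (R/2)((K-1)^(1/(p-1)) - 1)/((K-1)^(1/(p-1)) + 1), rewritten with a hyperbolic tangent to avoid overflow near p = 1; the helper names `lp_estimate` and `midrange_bound` are assumptions.

```python
import math

def lp_estimate(xs, p, iters=60):
    """Minimizer of sum(|x - m|**p) over m, for p > 1, by bisection."""
    def slope(m):
        return sum(math.copysign(abs(x - m) ** (p - 1), x - m) for x in xs)
    lo, hi = min(xs), max(xs)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def midrange_bound(k, r, p):
    """(R/2) * (t - 1) / (t + 1) with t = (K-1)**(1/(p-1)), for K > 1, p > 1."""
    # (t - 1) / (t + 1) equals tanh(log(t) / 2), which stays finite near p = 1.
    return 0.5 * r * math.tanh(math.log(k - 1) / (2.0 * (p - 1.0)))

data = [0, 0, 2, 2.1, 4]
midrange = 0.5 * (min(data) + max(data))
spread = max(data) - min(data)
for p in (1.5, 2.0, 3.0, 10.0, 50.0):
    gap = abs(lp_estimate(data, p) - midrange)
    print(p, round(gap, 4), round(midrange_bound(len(data), spread, p), 4))
```

The bound is attained when K - 1 points sit at one end of the range and one point at the other, for example 0, 0, 0, 0, 4 with p = 3.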
As an example, a path is constructed for the data set with the five points 0, 0, 2, 2.1, and 4.
[Plot: path of the estimator for this data set. Vertical axis: exponent, 1 to 10; horizontal axis: value, 0 to 4.]
4. Repeated exponents
How many exponents can have the same minimizing point?
Let $a_j = \sum \operatorname{sign}(x_i - m)$ and $b_j = |x_j - m|$, where the sum is over all points $x_i$ the same distance $b_j$ from $m$. Expression (1) then becomes the exponential sum $\sum_j a_j b_j^{\,p-1} = 0$.
The following theorem bounds the number of repetitions.
If (1) each $a_j$ is a non-zero real number and (2) $b_N > \dots > b_1 > 0$, then the number of changes in sign of the $a_j$-values is greater than or equal to the number of $x$-values where $\sum_{j=1}^{N} a_j b_j^{x}$ equals zero.
The following proof is similar to one in Laguerre (1883).
Proof. (By induction on the number of changes in sign.) If there are no changes in sign, then no $x$-value makes the sum zero.
For the induction step, assume the $a_j$-values have $C$ changes in sign and the result holds for any exponential sum with fewer changes in sign. Let $a_1$ to $a_w$ all have the same sign and let $a_{w+1}$ have the opposite sign.
$$\sum_{j=1}^{N} a_j b_j^{x} = b_w^{x} \sum_{j=1}^{N} a_j \left(\frac{b_j}{b_w}\right)^{x}$$
$$\frac{d}{dx} \sum_{j=1}^{N} a_j \left(\frac{b_j}{b_w}\right)^{x} = \sum_{j=1}^{N} a_j \log\!\left(\frac{b_j}{b_w}\right) \left(\frac{b_j}{b_w}\right)^{x}$$
Since $b_N > \dots > b_1 > 0$, when $j > w$, $\log(b_j/b_w) > 0$. Thus, for $j > w$, the $j$th coefficient of this derivative has the same sign as $a_j$. When $j < w$, $\log(b_j/b_w) < 0$. Thus, for $j < w$, the $j$th coefficient of this derivative has the opposite sign from $a_j$. The coefficient for $j = w$ is zero. Hence, the sign no longer changes at $j = w$. Therefore, the coefficients of this derivative have at most $C - 1$ changes in sign. By the induction hypothesis, this derivative has at most $C - 1$ $x$-values where it equals 0. By Rolle's theorem, the original sum can have at most $C$ $x$-values where it equals 0. ■
A similar proof of Descartes’ rule of signs works for polynomials. Exponential sums and polynomial-like sums, where the exponents may be any real numbers, are closely related.
Let $u_j = (\log b_j)/(\log b_1)$ and $y = b_1^{x}$. Then
$$\sum_{j=1}^{K} a_j b_j^{x} = \sum_{j=1}^{K} a_j b_1^{u_j x} = \sum_{j=1}^{K} a_j \left(b_1^{x}\right)^{u_j} = \sum_{j=1}^{K} a_j y^{u_j}.$$
The repeated-exponents problem has also been disguised as a number theory problem. One example from number theory is the following equality, which holds for j = 0, 1, 2, 3, 4, or 5.
$$1^j + 12^j + 21^j + 43^j + 52^j + 63^j = 3^j + 7^j + 28^j + 36^j + 57^j + 61^j$$
Notice that, with all terms moved to one side and ordered by base, there are 6 changes in sign.
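The equalities and the count of sign changes can be checked directly with exact integer arithmetic. The names `LEFT`, `RIGHT`, `power_sums_match`, and `sign_changes` below are illustrative assumptions.

```python
# Bases on the two sides of the equality.
LEFT = (1, 12, 21, 43, 52, 63)
RIGHT = (3, 7, 28, 36, 57, 61)

def power_sums_match(j):
    """True when the j-th power sums of both sides agree (exact integers)."""
    return sum(v ** j for v in LEFT) == sum(v ** j for v in RIGHT)

def sign_changes():
    """Sign changes of the coefficients after moving all terms to one side."""
    # Left-hand bases get coefficient +1, right-hand bases -1, ordered by base.
    signed = sorted([(v, 1) for v in LEFT] + [(v, -1) for v in RIGHT])
    signs = [s for _, s in signed]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

for j in range(7):
    print(j, power_sums_match(j))
print("sign changes:", sign_changes())
```

The equality fails at j = 6, as the theorem requires: six changes in sign allow at most six exponents where the two sides agree.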
5. Bounds
Order the data points and let
$c_j = (x_j + x_{K-j+1})/2$.
Let $c_- = \min_j c_j$ and
$c_+ = \max_j c_j$.
For all $p$ in $(1, \infty)$, $m(p)$ is in $[c_-, c_+]$.
Proof. When $m > c_j$, $x_j - m$ is negative and $|x_j - m| \ge |x_{K-j+1} - m|$. Thus, the contribution of this pair to expression (1) is negative. When $m$ is greater than all $c_j$-values, the contribution from each pair is negative (including the median if $K$ is odd). Thus, expression (1) is not zero. Similarly if $m$ is less than all $c_j$-values. ■
It also follows that
1) If $c_- < c_+$, then $m(p)$ is in $(c_-, c_+)$ for all $p$ in $(1, \infty)$.
2) The limit median is in $[c_-, c_+]$.
3) The Winsorized mean and the usual median are both in $[c_-, c_+]$.
The following graphs show the ratio of the actual length of the interval covered to $c_+ - c_-$. The curves show the number, out of 10,000 simulations with points from N(0, 1), with a ratio below .5, .7, and .9.
[Plot: Even size groups. Vertical axis: number of simulations, 0 to 10,000; horizontal axis: group size, 4 to 20.]
[Plot: Odd size groups. Vertical axis: number of simulations, 0 to 10,000; horizontal axis: group size, 5 to 19.]
The extra curve is for the number below 1.
In the example data set 0, 0, 2, 2.1, and 4, the pairs of data points are as follows. First, 0 and 4 with average 2; next, 0 and 2.1 with average 1.05; and finally 2 by itself. Consequently, $(c_-, c_+) = (1.05, 2)$. Any minimizing point in this interval has at most 2 exponents.
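The pair averages, and the sign-change count that limits the number of exponents, can be computed as in the sketch below. The helper names are assumptions, and grouping points by exactly equal floating-point distance is a simplification that is adequate for this small example.

```python
def pair_averages(xs):
    """Averages c_j of the j-th smallest and j-th largest data points."""
    xs = sorted(xs)
    k = len(xs)
    return [(xs[j] + xs[k - 1 - j]) / 2 for j in range((k + 1) // 2)]

def sign_changes_at(xs, m):
    """Sign changes of the coefficients a_j, ordered by distance b_j from m."""
    groups = {}
    for x in xs:
        d = abs(x - m)
        if d > 0:
            # a_j is the sum of sign(x - m) over points at the same distance.
            groups[d] = groups.get(d, 0) + (1 if x > m else -1)
    signs = [a > 0 for _, a in sorted(groups.items()) if a != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

data = [0, 0, 2, 2.1, 4]
averages = pair_averages(data)
print("c- =", min(averages), "c+ =", max(averages))
for m in (1.1, 1.5, 1.9):
    print(m, sign_changes_at(data, m))
```

For every trial value of m in the interval the count is 2, which matches the statement that a minimizing point there has at most 2 exponents.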
The next two results contain the idea that once the high or low terms dominate the expression, they will continue to dominate it.
1. If (1) each $a_j$ is a non-zero real number, (2) $b_K > \dots > b_1 > 0$, and (3) $\sum_{j=s}^{K} a_j b_j^{q} > 0$ for $s = 1, \dots, K$, then $\sum_{j=1}^{K} a_j b_j^{x} > 0$ for $x \ge q$.
2. If (1) each $a_j$ is a non-zero real number, (2) $b_K > \dots > b_1 > 0$, and (3) $\sum_{j=1}^{s} a_j b_j^{q} > 0$ for $s = 1, \dots, K$, then $\sum_{j=1}^{K} a_j b_j^{x} > 0$ for $0 < x \le q$.
6. Directions of the path
For $p$ in $(1, \infty)$, direct calculations give
$$\frac{dm}{dp} = \frac{\sum_{j=1}^{K} |x_j - m|^{p-1} \log|x_j - m| \operatorname{sign}(x_j - m)}{(p-1) \sum_{j=1}^{K} |x_j - m|^{p-2}}.$$
Taking the sign gives the direction the estimate moves. From the directions, iteration gives the points where the path turns.
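The derivative can be evaluated numerically and compared with a finite difference. The sketch assumes the form dm/dp = [sum of |x_j - m|^(p-1) log|x_j - m| sign(x_j - m)] / [(p - 1) times the sum of |x_j - m|^(p-2)], and it assumes m(p) does not coincide with a data point, so terms at zero distance are skipped. The names `lp_estimate` and `dm_dp` are illustrative.

```python
import math

def lp_estimate(xs, p, iters=60):
    """Minimizer of sum(|x - m|**p) over m, for p > 1, by bisection."""
    def slope(m):
        return sum(math.copysign(abs(x - m) ** (p - 1), x - m) for x in xs)
    lo, hi = min(xs), max(xs)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def dm_dp(xs, p):
    """Derivative of the minimizing point with respect to the exponent."""
    m = lp_estimate(xs, p)
    numerator = 0.0
    denominator = 0.0
    for x in xs:
        d = abs(x - m)
        if d == 0.0:
            continue  # assumption: m(p) is not exactly a data point
        sign = 1.0 if x > m else -1.0
        numerator += sign * d ** (p - 1) * math.log(d)
        denominator += d ** (p - 2)
    return numerator / ((p - 1) * denominator)

data = [0, 0, 2, 2.1, 4]
h = 1e-4
for p in (1.5, 2.0, 5.0):
    finite_diff = (lp_estimate(data, p + h) - lp_estimate(data, p - h)) / (2 * h)
    print(p, round(dm_dp(data, p), 6), round(finite_diff, 6))
```

A negative value means the estimate moves down as the exponent grows, and a positive value means it moves up.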
The following result helps find the direction at p = 1 or infinity.
Let $S$ denote the entire data set and $M$ its minimizing point at $p$. Let $S^*$ denote the data set with $x_j$ omitted and $M^*$ its minimizing point at $p$.
If $S$ has more than one data point and $p$ is in $(1, \infty)$, then (1) $M > x_j$ if and only if $M^* > x_j$, (2) $M = x_j$ if and only if $M^* = x_j$, and (3) $M < x_j$ if and only if $M^* < x_j$.
For the exponent at infinity, the following result gives the direction of approach.
If (1) the $c_j$-values are not all identical and (2) $c_F$ denotes the average of the pair with the largest range whose average does not equal $c_1$, then for sufficiently large $p$ both $m(p)$ and $c_F$ are on the same side of $c_1$.
A similar result holds near p = 1 when K is odd or the middle two data points are equal.
When $K$ is even and the middle two data points are different, the following expression can indicate the direction at the limit median.
$$\lim_{p \to 1} \frac{dm}{dp} = \frac{\sum_{j=1}^{K} \left(\log|x_j - m|\right)^{2} \operatorname{sign}(x_j - m)}{2 \sum_{j=1}^{K} \dfrac{1}{|x_j - m|}}$$
[Plot: path of the estimator for the example data set. Vertical axis: exponent, 1 to 10; horizontal axis: value, with 1.050, 1.617, and 2.000 marked.]
The example data set has a turning point at an exponent of 2.12. There are at most two exponents for each value of the estimator.
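A turning point can be located by bisecting on the sign of the direction. The sketch below assumes the direction is given by the sign of the sum of |x_j - m|^(p-1) log|x_j - m| sign(x_j - m), that it is negative at the lower starting exponent and positive at the upper one, and that m(p) is never exactly a data point; the function names and the starting bracket of 1.5 to 5 are illustrative choices.

```python
import math

def lp_estimate(xs, p, iters=60):
    """Minimizer of sum(|x - m|**p) over m, for p > 1, by bisection."""
    def slope(m):
        return sum(math.copysign(abs(x - m) ** (p - 1), x - m) for x in xs)
    lo, hi = min(xs), max(xs)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def direction(xs, p):
    """Quantity whose sign is the direction the estimate moves at exponent p."""
    m = lp_estimate(xs, p)
    total = 0.0
    for x in xs:
        d = abs(x - m)
        if d > 0.0:
            sign = 1.0 if x > m else -1.0
            total += sign * d ** (p - 1) * math.log(d)
    return total

def turning_exponent(xs, p_lo, p_hi, iters=50):
    """Exponent between p_lo and p_hi where the direction changes sign."""
    # Assumes direction(xs, p_lo) < 0 and direction(xs, p_hi) > 0.
    for _ in range(iters):
        p_mid = 0.5 * (p_lo + p_hi)
        if direction(xs, p_mid) < 0:
            p_lo = p_mid
        else:
            p_hi = p_mid
    return 0.5 * (p_lo + p_hi)

data = [0, 0, 2, 2.1, 4]
p_turn = turning_exponent(data, 1.5, 5.0)
print("turning exponent:", round(p_turn, 3))
print("estimate there:", round(lp_estimate(data, p_turn), 3))
```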
7. Outliers
This section contains two graphs of simulation results. Six data points were drawn from N(0, 1). After their location estimator was calculated, an outlier was added and the location estimator was recalculated. At each exponent, the number of the 10,000 simulations with differences below 1/8, 1/4, and 1/2 is shown.
[Plot: Outlier at 2. Vertical axis: number of simulations, 0 to 10,000; horizontal axis: power, 1.0 to 10.0.]
[Plot: Outlier at 3. Same axes.]
The extra curve is for differences below 1.