degrees of freedom: a correction to chi-square for physical hypotheses
TRANSCRIPT
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
1/47
De gree s of Free dom : A
Correc tion to Chi S quare
For P hysical Hypothese s
by J ohn Micha el Williams
2010-08-15
jmm will@comcast .ne t
Copyright (c) 2010, by J ohn Micha el William s
All Rights Rese rved
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
2/47
J . M. Williams Chi Squar e df v. 1.23 1
Abstract
In common practice, degrees of freedom (df) may be corr ected for t he n um ber of
th eoretical free para meter s as though para meter s were th e same as data cat egories.However, a free physical pa ra meter generally is n ot equivalent to a dat a cat egory in
terms of goodness of the fit.
Her e we use syn th etic, nonr an dom da ta to show th e effect of choice of
categorization and dfon goodness of fit. We th en explain th e origin of th e df
problem an d show how to avoid it in a th ree-step pr ocess:
First , the t heoretical curve is fit t o the dat a t o remove its
varian ce, leaving what, u nder th e nu ll hypoth esis, should be
structureless residuals.
Second, the r esidua ls ar e fit by a set of ort hogona l polynomials u p
to th e degree, should it occur , at which significan t va ria nce was
removed.
Third, th e nu mber of nonsignifican t polynomial term s in th e
origina l + ort hogona l set become t he dfin a sta ndard chi square
test.
This process reduces a general dfpr oblem to one of polynomial dfand allows
goodness of a fit t o be deter min ed by dat a cat egorizat ion a nd s ignifican ce level
alone. An example is given of an evalu at ion of physical dat a on neu tr ino
oscillation.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
3/47
J . M. Williams Chi Squar e df v. 1.23 2
Table of Conten ts
Abstract ______________________________________________________________________ 1
Table of Contents ______________________________________________________________2
I. Introduction ________________________________________________________________3
Parametric Statistics_______________________________________________________________ 3
Statistical Tests in Physics __________________________________________________________ 4
Doing Physics by Category__________________________________________________________ 4
Categories in NonRandom Chi Square ________________________________________________ 6
II. Meaning of Chi-Square _____________________________________________________16
Formalisms______________________________________________________________________ 16
Categories in Random-Sample Chi Square ___________________________________________ 21
Meaning of df____________________________________________________________________ 21
Need for a Correction to df_________________________________________________________ 24
III. The Parameter Correction __________________________________________________ 26
Definition of Correction ___________________________________________________________ 26
Application in Synthetic Examples __________________________________________________ 30
Application in Neutrino Oscillation Theory___________________________________________ 42
IV. Conclusion _______________________________________________________________46
References___________________________________________________________________46
Acknowledgements ____________________________________________________________46
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
4/47
J . M. Williams Chi Squar e df v. 1.23 3
I. Introduction
Pa r a m et r ic S t a t i st i cs
Hypoth eses ar e theoretical assu mpt ions. Scientific hypoth eses ma y be tested
against empirical dat a. It is funda ment al to all science, that no hypoth esis ever can
be proven by experimental dat a; an hypoth esis only may be r endered more or less
likely, or disproven.
The m ost fam ous m odern u se of forma l sta tist ics to test scient ific hypotheses
dat es back to R. A. Fish er [1], who, in th e early twent ieth centu ry, published a
meth od for an alysis of th e var iance in experimenta l data . Fisher developed
sensit ive ways of detectin g differen ces a mong da ta as a fun ction of category of
tr eatm ent. The treat ment s in his favorite exam ples were on agricultu ra l variables
such as crop fert ilizat ion, light, water ing, an d so fort h. Fish er's meth ods werebroad in scope--the nonparametric Fisher Exact Testis na med for h im--but his
an alysis of varian ce was based on th e apparen tly narr ow assumpt ion th at t he data
were sampled independently an d would be normally distr ibuted with const an t
variance.
Defining a r an dom var iableXwith n orm ally distr ibuted probability density
( )N X; , 2 by
( )( )
P x e
x
X = =
1
2 22
2
2
, (1)
it easily ma y be seen th at th e two par am eters of such a distribution are th e mean
and t he variance 2 . The mean (expected value) of dat a ma y be ta ken to define th e
effect of an experimenta l treat ment . A sam ple ta ken of dat a depending on a
mixtu re of differen t effects will have a bigger varia nce tha n a sa mple dependin g on
one; so, the var ian ce ma y be an alyzed to show differen ces in th e un derlying mea ns.
If we use m ean values to represent cat egories of data, t hen hypoth esis testing by
an alysis of var ian ce ma y be used to test differen ces am ong the cat egories. Fish er's
app roach to ana lysis of var ian ce cam e to be kn own a s param etric statis tics .
Fisher's a na lysis of var iance rat iona le has been su pplemented by Bayesianmet hods [2, 4]. Pr oblems not fitt ing Fish er's par adigm ha ve extended th e scope of
hypothesis-testing statistics into the nonparametric realm [3, Ch. 7; 6, Sect. 9.2 &
Ch. 12]. Nonpara metr ic sta tistics att empt to resolve questions for dat a in very
sma ll samples, for mismat ched var iances, an d for experiment al r esults sometimes
more qualitative than numerical.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
5/47
J . M. Williams Chi Squar e df v. 1.23 4
Sta t i s t i ca l Tes t s in Physics
A problem with hypoth esis testing in physics is th at mu ch of th e subject m at ter is
mult iply par am eterized. In physics, a good hypoth esis ra rely sta tes anyth ing aboutcat egories; instea d, it provides an explan at ion of a set of one or more ph enomena
which may var y continu ously. The dat a typically involve broad ra nges of ener gy or
moment um , or avera ges of ma ny car efully winnowed events . Quest ions ar ise of, Is
th e spin 1/2? Does th e field var y as 1/r? The conn ection of an y physical hypothesis
with data typically is through multidimensional continua of complexity.
Rath er t ha n an alyzing experimenta l variat ions t o discover differences in m eans
on na rr ow doma ins of one or m ore cont inuu a, th e preferr ed way to demonst ra te
supp ort for a physical hypoth esis is to show tha t it m ay be used u nequ ivocally to
cont rol some ph ysical pr ocess in su ch a way a s no oth er kn own h ypoth esis might
explain.
For exam ple, a demonst ra tion of th e Ha ll effect would involve an unequivocal
excess of cha rge along one side of a flat condu ctin g str ip. To achieve this
confirmation of the predicted effect, superior instrumentation or measurement of
previously ignored qua nt ities would be adopted. The gra dient of th e cha rge, th e
effect's time-course of development, and other factors would be examined for
consist ency with th e hypoth esis. Rar ely would physical meas ur emen t be on such a
na rr ow ran ge of values tha t t he var iance would be const an t, independent of th e
mea n. Also, if not all evidence su pported th e hypothes is, alt ern at ives would be
explored inst ead. Merely showing th at one side of a condu ctin g str ip seemed more
likely to have a char ge excess th an th e other, a test between two cat egories on one
variable, would be a n un sat isfying r esult.
Doing Ph ysics by Ca t egor y
Let's consider t he special case of curve-fitt ing hypoth eses. The problem is to
distinguish among various hypoth eses of th e shape of th e cur ves. We might sta rt
by ar ra nging th e data a long one or more cont inua . However, on a cont inuu m, two
data (probably) never coincide; this means we have slim bases for estimating the
mean , var iance, or a ny oth er par am eter of th e assum ed underlying random
variable.
So, to use par am etr ic st at istics, we define cat egory boun dar ies (bins). By ma kingth e bins wide enough, t he n umber of dat a a vailable in each will be large enough t o
provide a good estimat e of th e bin mean . We ma y assu me th e var iance in a bin to
be const an t, but only if th e bins a re na rr ow enough. Sometimes riskily, we may
assu me th e variance over all bins t o be the sa me, const an t value.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
6/47
J . M. Williams Chi Squar e df v. 1.23 5
A simple example will illust ra te th is approach. Ima gine th at t wo different
physical theories ar e to be test ed against a certa in data set. Perh aps, the th eories
might r elate t o field int ensity Ias a fun ction of dista ncex.
Hypothesis I: The dat a follow a quadr at ic fun ction. This would st ren gth en beliefin Theory I.
Hypothesis II: The data follow a cubic function. This would support Theory II.
Here a re th e two hypoth etical predictions on a da ta set which h as been generat ed
purely artificially, with no randomization:
Figure 1. Simulated data f i t by eye . Deviations from the data see m
greater for the cubic than for the quadrat ic . The curves represent
( )I x= 7 1 22
for the quadratic, and ( )I x= 25 1 23
for the cu bic.
Clear ly, visua l inspection tells us th at t he quadr at ic fit is better. So, we
imm ediat ely should drop Theory II in favor of Theory I. Or, should we? How can wetest how well Theory I ha s been confirm ed? Is it possible Theory II might be more
elegant to derive, simpler, or oth erwise more a tt ra ctive tha n Theory I, even th ough
th is data set was not fit so well?
We shall discuss t he special case in which we wish t o test Theory I again st th e
dat a, leaving out Th eory II entirely. This kind of sta tistical test, which mea sur es
th e fit r at her th an rejecting a competing fit, is called a goodness of fittest.
Distance x (arbitra ry units)
Field
Intensity
I
0.0
0.2
0.4
0.6
0.8
1.0
1.2
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
Quadratic Fit
Cubic Fit
Data
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
7/47
J . M. Williams Chi Squar e df v. 1.23 6
The m ost popular goodness of fit t est is by ch i sq u a r e , a sta tistic first derived by
Helmert [2, Sect. 3.4.3] and later independently derived and popularized by Karl
Pear son; it is na med by Pearson after th e Greek letter chi (). Chi squa re simply
is th e distribut ion of the su m of squar es of a set of stan dar dized, norma lly
distributed data .
Categor ies in NonRand om Chi Squa re
Before pr esent ing form alities a bout chi squ ar e, we first develop its a pplicat ion t o
th e cur ve-fit exam ple in F igure 1 above. We ignore th e cubic cur ve for a wh ile.
One Cat egory
Suppose we comput ed th e mean values of the dat a a nd of th e quadr at ic cur ve on
th e interval of interest , 01 0 9. . x . We would get, usin g all 14 ava ilable dat a
points sh own in F igur e 1, a sta tistically invalid ana lysis but a n inst ru ctive exam ple
as sh own in Ta ble 1.
Table 1. Comparison of data vs quadratic means on one
category (1 degre e of freedo m)
Distance x Data
of I
Quadratic
0.151 0.73
0.201 0.59
0.251 0.44
0.301 0.300.351 0.18
0.401 0.08
0.451 0.02
0.501 0.00
0.551 0.02
0.601 0.08
0.651 0.18
0.701 0.31
0.751 0.45
0.801 0.60
mean = 0.2855 I = 0.3733
sa mple sd ofI= 0.2349 I = 0.3339
sa mple sd of m ea n 0.0628 I
= 0.0892
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
8/47
J . M. Williams Chi Squar e df v. 1.23 7
The mea n of th e qua dra tic was compu ted a s follows, using th e expected valu e
operator,E,
( )( ) ( )
I E Id x I x
dx
dx x
x= =
=
0 1
0 9
0 1
0 9
2
0 1
0 9
0 1
0 9
7 1 2
03733.
.
.
.
.
.
.
. . . (2)
The operator ( )E means t he same a s the familiar ;E is used her e for symm etry
with th e var iance operat or, ( )V .
The standard deviation of the quadratic was computed as follows,
( ) ( )( ) ( ) ( )[ ]( ) ( )
I V I E I E I E I E I sqrt d x I x
dx
d x I x
dx
= = = =
2 2 2
2
0 1
0 9
0 1
0 9
0 1
0 9
0 1
0 9
2
.
.
.
.
.
.
.
.
( )( )[ ]I sqrt
dx x
dx
=
7 1 2
0 373 0 3339
2 2
0 1
0 9
0 1
0 9
2.
.
.
. . . . (3)
The sta nda rd deviations of th e mean s were comput ed by dividing the sta nda rd
deviations of the variables (data or quadratic) by the square root of the sample size,
N = 14 3742. .
Now, inst ead of just noticing in Table 1 tha t t he mea ns a re about 0.09 apa rt , and
th at t he sta nda rd deviat ions of th e means a re about th e same size, suggesting no
significan t difference between t he t heory a nd th e dat a, let's look a t th e chi squa re
statistic. A unit-normal distribution is defined as one with m ean 0 an d varia nce 1,
we describe such a distribution as ( )N 0 1, , a notat ion which may be relat ed to
equa tion (1) in a n obvious way.We follow Brownlee [3, Sect. 1.27], an d other s in defining chi squ ar e as a sum of
squar ed un it-normal deviat es: Given a nu mber NC of data cat egories,
( )Data ii
N
NC
2 2
1
0 1==
, , (4)
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
9/47
J . M. Williams Chi Squar e df v. 1.23 8
with comput ed or ta bulated cum ulat ive probability P at significan ce level xp ,
( )[ ]P df x pp2 , represented as ( )p df2 .
In comm on usa ge, which we shall quest ion lat er, th e degrees of freedom var iable
dfis the n umber of sta tistically independent cat egories of th e data , and t he un it-normal deviates are differences between theoretical (or otherwise anticipated)
values an d data values, one for each cat egory. Becau se the differen ces are defined
sta nda rdized, their mea ns a ll will be expected t o be 0 under t he nu ll hypoth esis tha t
the fit is good.
Usua lly, the var iance in ea ch dfcat egory would be th e varian ce un der t he nu ll
hypoth esis; in our int roductory exam ples, th is is the var iance under th e assu mpt ion
th at th e data were distr ibuted ran domly in each cat egory as determin ed by th e
cur ve being fit. In t his intr oduction, we sha ll be looking at t he cur ve fit to dat a
generat ed under a th eory an d exam ining the adequa cy of th e representa tion of th at
th eory by th e fitt ed cur ve. So, we are looking at two differen t t heories, in a sen se,one a simplificat ion of th e other . In our int roductory examples, ther e is no ra ndom
component other th an rounding error of th e nu merical values.
In common u sage, we would estima te t he var iance in each cat egory u sing th e
dat a, th us losing a degree of freedom [4, Sect. 10.11]. In t hese int roductory
examples, we assume the appar atu s retur ns good data , and th at we are measuring
th e lawful realization of a physical principle. We assu me we know the mean s and
th e cha nge in th em within a cat egory; so, we need not use da ta to estimat e
an yth ing; we mer ely ar e mea sur ing th e fit of a cur ve. So, in the first few examples
following, we sha ll use th e varia nce of th e qua dra tic cur ve to sta nda rdize in each
cat egory, rath er th an t he varian ce of th e equally th eoretically corr ect dat a. Theactua l difference in t he var iances is not lar ge anyway, as ma y be seen in th e
tabulations below.
So, to evalua te chi squa re, we ma y compu te t he following sum :
( )
( )
( )[ ]( )
Datai i
ii
N
i
ii
NX E X
V X
X E X
V X
C C
2
2
1
2
1
=
=
= =
, (5)
in which, in each category, the t heoret ically expected va lue ( )E X is subtracted from
th e value of the observed datu m X. In genera l, each i-th cat egory r epresents a
mea n of some observat ions, from which observed mea n will be subt ra cted t he i-th
expected mean . Under t he nu ll hypoth esis th at t he fit is good, the difference in the
nu mer at or of th e sum in (5) will, of cour se, be zero. Dividing each differen ce by the
expected standard deviation ofX sta nda rdizes the var iance of each of th e i elements
in the sum.
We notice here, immediat ely, th at our chi squa re ignores t he t heoretically
expected shape of th e cur ve: It just sums squar ed differences. This mean s tha t chi
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
10/47
J . M. Williams Chi Squar e df v. 1.23 9
squar e is not tr uly a para metr ic sta tistic. Ea ch cat egory is sta nda rdized
individually by its own expected va ria nce; th erefore, chi squar e does not r equire a
const an t varian ce over all cat egories tested. This is a great advant age in physics.
The sha pe of the t heoretical curve enter s only locally, at th e gran ular ity of the
individua l cat egories, when a differen ce ( X Ei i ) is taken an d a variance estima tedperha ps on th e assum ption t ha t t he varian ce will be const an t.
Anyway, let's decide to reject t he fit at th e p = .001 level. Keep in min d th at we
do not have a valid ran dom sam ple an d tha t t his is mean t as a th ought -provoking
int roduction to a problem of sta tist ical inference in physics. The ta bles of chi
square tell us tha t ( ).0012 1 11 . We tr ivially comput e chi squar e from the Table 1
means t his way:
( ) ( )( )
DataX E
V2 1 1
2
1
2
2
2855 3733
089210=
=
. .
.. ; (6)
an d, confirmin g our visual impr ession, th e resu lt is not significan t, so th e fit ma y be
considered good to th e extent t he overall averages of the dat a a nd qu adr at ic
hypothesis in Table 1 are not shown differen t. And we ar e confident th e fit would
have been shown good with likelihood of error p = .001, ha d th e stat istical
requirements been fulfilled.
We should ment ion h ere th at th e difference in m eans in Table 1 also might h ave
been tested by oth er sta tistics, such a s Stu dent's t. Above, we have accepted t he
mean an d varia nce of the qua dra tic fun ction a s th eoretically errorless, and ha ve
accepted the quadratic's computed ( )N . ,.3733 0892 distribution a s th at of the data ,
un der the nu ll hypoth esis. In cont exts oth er tha n th e present int roduction, this
would be qu ite wrong sta tistically, becau se, visibly, the mean an d t he varian ce
cha nge systemat ically over t he domain of interest (sta nda rd deviation is a measu re
of slope)--an d we ha ve a qu adr at ic cur ve, not a ra ndom norma l variat e.
As a check, ignoring t hese pr oblems, at p = .001 (two-ta iled), a norm ally
distributed da ta mean would ha ve to lie about 3.3 stan dar d deviat ions of th e mean
awa y from th e qua dra tic mea n to reject th e goodness of th e fit. This would be about
0.30 distance units, clearly about 3 x greater than the 0.09 difference in Table 1.
Usin g the n orm al significan ce value in a second check, if we subst itu te 0.30 in th e
chi squ ar e nu mer at or of (6) above, we get 0 30 0 09 1112 2
. . . , closely mat chin g th e chisqua re significan ce level of 11. This is a t au tology, confirm ing our ar ith met ic,
because by definition ( ) ( )[ ]. . ,0012 0012
1 0 1 N .
Two Ca tegories
Now, let us double the nu mber of cat egories an d th e sam ple size, too. We
increase t he density of the dat a point s but keep the sa me qua dra tic fitting function:
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
11/47
J . M. Williams Chi Squar e df v. 1.23 10
Distance x
Field
Intensity
I
-0.1
0.1
0.3
0.5
0.7
0.9
1.1
0.2 0.3 0.4 0.5 0.6 0.7 0.8
x1x2
Figure 2. Two-category fi t by eye . More of the same arti f icial data as
in Fig. 1 , for the sam e qu adratic f i t ( ) ( )I a x b x= = 22
7 1 2 .
This time, we part ition t he domain of an alysis x int o two regions, or bins, called
x1 and x2 , as shown in Figure 2. The horizont al lines show th e different m eans ofI
in the two bins. We ta bulate th e data shown, an d recalculat e the para meter s of th e
qua dra tic separ at ely for th e two ha lves, as was done above. We cha nge the problem
slightly here, by integrat ing th is time on t he sm aller, more pr ecise doma in of
( )x . ,.15 85 instead of ( )x . ,.10 90 , mak ing the domain of evaluat ion of th e quadr at ic
ma tch the doma in of th e sampled dat a more closely. Ea ch ha lf now ha s 14 dat a
points. The result is in Table 2.
Looking at Figure 2, with two categories, the obviously-changing data at least are
more or less monotonic in ea ch bin . As confirmed in Ta ble 2, th e two bins, dividing
th e data exactly in h alf at a point of symmetr y, mak e th e two quadr at ic cat egories
almost ident ical. The quadra tic sta tistics all ar e lower in Table 2 th an in Table 1
becau se of th e sma ll reduction in t he domain of int egration, which elimina ted t he
largest values ofdI dx . The sam ple sta nda rd deviat ions and sta nda rd deviat ions of
the sample means in Table 2 agree reasonably closely with those in Table 1, because
we ha ve essent ially th e sam e functiona l cha nge in each "data " bin in both cases,
ma king slopes all about t he same. However, the quadra tic cur ve-fit stan dar d
deviat ions reflect th e nar rower doma in of integrat ion. If we calculat ed a stan dar d
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
12/47
J . M. Williams Chi Squar e df v. 1.23 11
deviation of th e gran d mea n in Table 2, using both bins, we would find it 2
sma ller th an in Table 1, becau se of th e doubling of th e dat a (sam ple size).
Table 2. Comparison of data vs quadratic mean s on two
categories (2 degrees of free dom)
x1: Low Half x2: High Half
x Data Qu ad x Data Qu ad
1 0.151 0.7338 0.501 0.0000
2 0.175 0.6677 0.525 0.0053
3 0.201 0.5926 0.551 0.0220
4 0.225 0.5216 0.575 0.0471
5 0.251 0.4445 0.601 0.0844
6 0.275 0.3747 0.625 0.1273
7 0.301 0.3023 0.651 0.18228 0.325 0.2396 0.675 0.2396
9 0.351 0.1777 0.701 0.3077
10 0.375 0.1273 0.725 0.3747
11 0.401 0.0811 0.751 0.4504
12 0.425 0.0471 0.775 0.5216
13 0.451 0.0203 0.801 0.5985
14 0.475 0.0053 0.825 0.6677
mean (n=14) 0.3097 I = 0.2858 mean 0.2592 I = 0.2867
sam ple sd ofI 0.2494 I = 0.2557 sd I 0.2304 I = 0.2556
sa m ple sd of m ea n 0.0667 I
= 0.0683 sd m 0.0616 I
= 0.0683
As before, ima gining th e two cat egories t o be sta tist ically indepen dent , we ma y
look for a rejection of goodness of fit in Figure 2 at the p = .001 level at ( ). .0012 2 138 .
We ma y compu te t he va lue of chi squ ar e from Ta ble 2 as follows:
( ) ( ) ( )( )
( )
( )Data
X E
V
X E
V2 1 1
2
1
2 2
2
2
2
2
2
2
3097 2858
0683
2592 2867
06830 28=
+
=
+
. .
.
. .
.. ; (7)
Again, we get nowhere nea r a value a llowing us to reject goodness of fit; in fact,
we are far th er away, becau se the slight adjustm ent of th e doma in away from th e
largest values ofI (above) ha s r emoved th e worst deviations from a per fect fit.
Two Cat egories with More Da ta
Now, supp ose we complete our experimen ta l work an d ha ve a set of some 1,000
dat a a t equa l 0.001 int ervals on t he doma in ( )x 0 1, . As in Figur e 2 an d (7) above,
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
13/47
J . M. Williams Chi Squar e df v. 1.23 12
we ha ve decided to keep just ( )x . ,.15 85 . This mean s we will keep just 700 usefuldata .
With out ta bulating da ta , the r esult of a chi squar e test on two cat egories with 350
dat a ea ch is shown in Table 2a:
Table 2a. Comparison of num erous data vs quadratic
means on tw o categories (2 degrees of freedom)
x1: Low Half x2: Hig h Half
Data Qu ad Data Qu ad
mean (n=350) 0.2831 I = 0.2858 mean 0.28520 I = 0.2867
sam ple sd ofI 0.2321 I = 0.2557 sd I 0.23290 I = 0.2556
sample sd of mean 0.01241 I
= 0.01367 sd m 0.01245 I
= 0.01367
Of cour se, the qua dra tic means an d var iances have not chan ged; however, we now
must sta nda rdize th e cat egory means by the much lar ger sample size. The result is,
( ) ( ) ( )( )
( )
( )Data
X E
V
X E
V2 1 1
2
1
2 2
2
2
2
2
2
2
2831 2858
01367
2852 2867
013670 05=
+
=
+
. .
.
. .
.. ; (7a)
an d th is is even fart her from the significan ce level of 13.8. We th us might conclude
from chi squa re on two cat egories t ha t t he fit of th e qua dra tic was good.
Ten Ca tegories
As a bove, we ha ve 700 usa ble, equa lly-spaced dat a on ( )x . ,.15 85 . Let us test the
fit of th e quad ra tic fun ction aga in, th is time dividing our x doma in int o 10
cat egories. Using "box plot" repr esent at ion, th e result , aga in a fit by eye, is in
Figur e 3. Ea ch of th e 10 bins ha s been processed as we did in th e preceding
an alyses. Again, we ha ve such pr ecision of measu remen t of th e dat a in t his
ar tificial exam ple, th at th ere is no legitima te r an domn ess.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
14/47
J . M. Williams Chi Squar e df v. 1.23 13
Distance x
Field
Intensity
I
0.0
0.2
0.4
0.6
0.8
0.15 0.45 0.85
Mean sd of I
Mean sd of Imean
Figure 3. Ten-catego ry f i t by eye. Sum marized set of 700 of the same
arti ficial data as in Figs . 1 and 2, for approximately the s ame quadratic
fit, ( )I x= 7 1 22
.
Notice in Figure 3 t ha t t he var iance in ea ch bin clear ly increases with t he local
slope of th e cur ve. Ea ch mea n repr esents 70 data ; th e equa l-spacing of th e data
project on the local slopes to determine the standard deviations shown.
Also notice th at th e large num ber of 70 data in each bin, tr eated a s th ough
sam pled ran domly and independently, yield such a sma ll stan dar d deviat ion of th e
bin mean , tha t for only one of th e dat a, th e second from th e left, does th e quadr at ic
actua lly fall with in a sta nda rd deviation of th e mean .
To prepa re our chi squa re t est of goodness of fit of th e qua dra tic, we sum ma rize in
Table 3 the sta tistics plott ed in Figure 3:
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
15/47
J . M. Williams Chi Squar e df v. 1.23 14
Table 3. Comparison of data vs quadratic means on ten
catego ries (10 degrees of freed om)
Bin #(70 data)
Binmean
Bin sd sd o f Bin
mean
Quadmean
Quad sd sd of Quad
mean
2
e lement
1 0.63680 0.05840 0.006976 0.695150 0.087730 0.010486 30.97
2 0.43190 0.05945 0.007106 0.421240 0.068241 0.008156 1.71
3 0.24020 0.05077 0.006069 0.215930 0.048726 0.005824 17.37
4 0.09301 0.03404 0.004068 0.079220 0.029227 0.003493 15.58
5 0.01348 0.01220 0.001459 0.011110 0.009933 0.001187 3.99
6 0.01407 0.01253 0.001498 0.011599 0.010206 0.001220 4.10
7 0.09468 0.03432 0.004102 0.080689 0.029524 0.003529 15.728 0.24270 0.05096 0.006091 0.218379 0.049003 0.005857 17.24
9 0.43480 0.05901 0.007113 0.424669 0.068529 0.008191 1.53
10 0.63970 0.05828 0.006966 0.699560 0.088010 0.010519 32.38
Value of ( )Data2 10 141
But, ( ).0012 10 is about 29.5; so, now, with a lar ger sam ple and df= 10, we easily
reject th e nu ll hypoth esis and conclude t ha t t he fit by th e quadr at ic is not good to
explain t he dat a.
Fifty Categories
Before leaving th is intr oduction, we look at our nonr an dom dat a once more to see
wha t would ha ppen if we tried a chi squa re goodness of fit t est a s above, but with
th e dat a subdivided into more nu merous categories, say, 50 of them.
A priori , if we had some ra ndomn ess, we might hope tha t 50 categories would
ma ke for fewer dat a per bin an d th us lar ger varian ces of th e mean in each bin; so,
with higher dfat th e significan ce point , th e cha nce for a good fit m ight be im proved
over what it was with 10.
The r esult is plott ed in F igur e 4 a nd dr am at ically shows th e effect of systema tic
versus ran dom variat ion. With 50 cat egories and 700 data, the real shape of th edat a cur ve becomes evident ; and, quit e cont ra ry to our conclusion ba sed on Figur e 1
or 2 a bove, an d ma ybe to our first impression of Figure 3, we see tha t t he qua dra tic
isn't even close to a good fit to the da ta . Perh aps a slightly bett er qua dra tic might
ha ve been chosen by eye, but th e sha pe of th e dat a cur ve obviously prevent s chi
squa re ever from acceptin g th e qua dra tic on 50 degrees of freedom.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
16/47
J . M. Williams Chi Squar e df v. 1.23 15
Data: I = 1 - cos(x-.5) + cos( 3(x-.5) ) - cos( 5(x-.5) )
Distance x
Field
Intensity
I
0.0
0.2
0.4
0.6
0.8
0.15 0.50 0.85
Mean sd ofI
Mean sd ofImean
Figure 4. Fifty-catego ry f i t by eye. Sum marized set of 700 arti ficial
data for the qua dratic f i t , ( )I x=
7 1 2
2
. The funct ion used to generatethe data throughout the Introduct ion i s g iven in the graph t i t le .
Tabulation of the Figure 4 statistics has been omitted; however, the significance
point is ( ).0012 50 87 ; the obtained value is comput ed at Data
2 304= ; so, again , th e chi
squar e test r eassur es us th at we should reject t he fit as n ot good.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
17/47
J . M. Williams Chi Squar e df v. 1.23 16
II. Me an in g of Chi-Squ are
At t his point, we leave th e intr oductory examples for some form alism a bout th e
sta tist ics. Gener alizat ion of th ese forma lisms to th e mu ltidimen siona l case [4, 5] isnontrivial but rea sona bly str aightforwar d.
For m a li sm s
Relation to Gam ma Distribut ion
The factorial of an int eger n , defined a s ( ) ( )Fact ! ...n n n n nii
n
= = =
1 11
, appears in
combinat orial an alysis everywhere. The factorial ma y be expressed as a special
case of the cont inu ous gamm a fun ction . The gamma fun ction is discussed in th e
cont ext of Bayesian inferen ce in [6, Sect. 7.3 ff.]; the der ivat ions below follow th oseof Feller [7 vol II, Ch. 2]. It isn't mu ch of an overst at ement t o say tha t t he gamm a
fun ction is as funda ment al t o stat istical inference as th e exponent ial fun ction is to
th e solut ions of differen tia l equat ions.
The gamm a function is defined as,
gamma(x) ( ) =
x d ex 1
0
. (8)
Integrating (8) formally by parts,
( ) ( ) xx
d ex
xx= = +
1 1
10
; th erefore,
( ) ( ) x x x+ =1 ; an d so, (9)
( ) ( ) ( ) ( ) x x x x= 1 2 2 , (10)
which clearly shows t he r ecur sive relat ion ma king ( ) ( ) x x= 1 !, for th e special case
ofx an integer.
Using th e gamma fun ction in (8), we ma y define t he gam m a probability density
of random variable Xon ( )0, by,
( ) ( )( )
P X x x e x= = = gamma ,
1
1
. (11)
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
18/47
J . M. Williams Chi Squar e df v. 1.23 17
Looking at (11), we notice that the familiar exponential density then may be
defined a s ( )gamma , 1 ; and s o, by ana logy, in (11) we will ha ve th e mea n ( )E X =
and t he variance ( )V X = 2 .
In a sma ll digression, recall th e combin at oria l express ion for t he n um ber of ways(un order ed) of ta king a sam ple ofj objects from a discret e populat ion ofm of th em:
( )( )( )
( )( ) ( )
( ) ( )m
j
m
j m j
j m j
j m j
j n
j n
j n
j n
=
=
+
=
+=
+ +
+ +
!
! !
!
! !
!
! !
1
1 1. (12)
The discrete case ma y be genera lized to the beta integral, in t erms of th e gam ma
fun ction, sta rt ing this way:
( )( )
( ) ( )
B ,
=+
1
. (13)
Oth er u ses of th e gamm a fun ction ar e discuss ed in [4, Ch. 4] an d [7 vol II, Ch. 2].
Retur ning to th e subject, we may r edefine th e un it norm al density as in (1) above,
in th e following way: Fr om (8) an d (11), settin g to 1 2 and to 1 2 yields,
( )gamma x ed e
xe
xx1
2
1
2
1
1
2
1 21 1
2
112
1
2
1
21
2
0
2,
=
=
. (14)
The definite integra l ( )1 2 evaluates t o ; so,
( )
gammax
e
x1
2
1
2
1 1
2 12
1
2
0
1
2
2
,
=
. (15)
In th e brackets in (15), we have x in a unit norma l distribution. Becau se
( )1
0 1x
N X; , is the density of th e squar e of th e unit norma l ran dom var iableX [7
vol II, p.47], we ma y say t ha t
( )[ ]gamma N 1
2
1
20 1
2
, ,
. (16)
So, (16) mea ns t ha t ( )gamma ,1 2 1 2 represent s th e probability density of th e squar eof a ra ndom variable Xdistributed a s ( )N 0 1, . It is easy to see th e genera lizat ion of
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
19/47
J . M. Williams Chi Squar e df v. 1.23 18
(16) to a sum ofn variat es, each distr ibuted as ( )N 0 1, : Therefore, reca lling (4)above, we ma y write
( )gamman
n1
2 2
2,
; an d, (17)
th is is just th e definition of th e distr ibut ion of chi squar e.
Thus, we have used the gamma density to show th at a sum ofn squared unit
normal varia tes will be distributed a s ( )2 n . By an alogy to (11) above, th e mean of
( )2 n will be n , and th e varian ce will be 2n .
Moment Generat ing Fun ction
Not ma ny rea ders would see th e relat ion of (15) to (16) above. An ea sier way to
show the r elation bet ween th e normal a nd chi squar e probability densities is bycompar ison of th eir moment genera tin g fun ctions (mgfs).
Th e m om ent generating fun ction of a r an dom varia ble is a s eries expan sion of its
densit y which, t erm by term , yields as coefficient s th e n -th moments of th e ra ndom
variable about t he origin. The first moment is the mean , the second t he variance,
an d so fort h. What is import an t for us here, is tha t two ra ndom var iables, un der
very genera l conditions, ha ve the sa me density if and only if all th eir moments ar e
identical [6, 6.7.8; cf. 6.3.3]. And, if th ey both h ave the sam e mgf, th eir moment s all
will be identical.
The mgf of a r an dom variable X is defined [6, Sect . 7.1] for cont inu ous
distributions on ( ) , as
( ) ( ) ( )mgf ; f t X E e dx e X xtX tx= = =
, (18)
in wh ich ( )f X x= is th e probability density function ofX. The momen ts will be
obta ined by a power series expan sion on t; the int erval of series convergen ce does
not concern us here.
The rest is very simple: Fr om (14),
( )( )f X x
xe
x
2 1
21
2
1
2
1
2
1=
=
gamma , ; so, from (18),
( )( ) ( )mgf ;t dx e xx t
2 1 21
211
2=
. (19)
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
20/47
J . M. Williams Chi Squar e df v. 1.23 19
Cha nging var iables in (19) with y x= 1 2 an d fixing t he lower limit of int egrat ion
immedia tely yields, for chi squa re,
( )
( )
( )
mgf ;t dy et
y
2
1 2
2
0
11
2
2
=
. (20)
On th e oth er ha nd, the un it normal density (1) ma y be writt en,
( ) ( )f ,N
x
x e0 1
21
2
2
X = =
. (21)
So, th e mgf for X2 will be,
( )( )( )
mgf ; ,t N dx e dx etx xt
x2 2
0
1 2
2
0
0 11
2
1
2
2 22
= =
, (22)
which clear ly is th e sam e as for t he chi squa re in (20).
While on m gfs, it sh ould be mentioned tha t t he mgf also is perhaps t he easiest
way to derive the var ian ce of th e Poisson dist ribu tion [7 vol I, Sect.6.5 - 6.7] . The
Poisson distr ibut ion, wh ich gives t he (discret e) probability ofk events each with
small probability , has probability function [6, 6.3.7],
( )P k ek
k
X = =
;!
, (23)
an d may be shown to have mean = var iance = . For n repeat ed Poisson t rials, the
mean will be ( )nE X n= , and th e varian ce will be ( )nV X n= . For n repeated
tr ials, the sta nda rd deviation of th e Poisson m ean will be ( )nV X n = n n
= ; so, by the centr al limit t heorem, we would expect a Poisson dist ribu tion sum
ofn repeated trials with mean to approach a norma l distr ibution with m ean n
an d stan dar d deviation of th e mean n (= stan dar d error).
Sufficiency
A sample of a ra ndom variable may be seen a s a measu re of th e inform at ionknown about t he un derlying physical process. The sample size and th e var iance of
th e measu remen t determine the inform at ion in th e sam ple: Var iance adds unu sable
inform at ion (possible inter pret at ions of th e physical pr ocess) and so redu ces th e
precision of th e sam ple in represen tin g th e physics. Sam ple size decrea ses the
var ian ce of th e mean of th e sam ple and so increases t he pr ecision of th e sam ple in
repr esent ing the mean of th e physics. Becau se physical processes are of little value
un less repeat able, an d becau se the mea n ma y be used as an expecta tion of fut ur e
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
21/47
J . M. Williams Chi Squar e df v. 1.23 20
repetitions, size of the sample and variance of the measure work inversely in
determining th e inform at ion in a r an dom sam ple.
Given a variety of sta tistics describing a ra ndom sam ple, a sufficientstatistic [6,
Sect. 10.4] is one wh ich describes t he da ta with no loss of inform at ion, so th at
inform at ion beyond th at of the su fficient sta tist ic does n ot im prove prediction of
fut ur e samples. Given th e same ph ysical process, a set of measu rement s of sma ll
var ian ce genera lly will be sufficientto predict a set with greater var iance: The
latt er set m ay be viewed as th e form er set plus some extr a r an dom variat ion.
Gener ally, sufficiency requires choice of a sta tist ic good en ough to repr esent th e
un derlying ph ysical p rocess; oth erwise, err or in choice of sta tist ic will add
system at ic var ian ce which will redu ce predictability. An example of th is was in the
int roductory example a bove, in which t he wr ong choice of cur ve (qua dra tic) add ed
systematic error--clearly systematic, because there was no random error.
In t he present cont ext, we may see chi squa re a s a wa y to represent th e goodness
of choice of cur ve, such goodness being an un avoidable pr erequ isite t o exam ina tion
of th e ran dom err or in th e data . This goodness must be relat ive to th e cat egories of
th e data , not t o compet ing cur ves. But , in an y case, we can not look at s ufficiency
without goodness of fit.
So, how can we know tha t chi squar e might be u seful generally as a test , when,
perha ps, the chi-squar e assu mpt ion of a su m of squar es of unit n orm al deviat es
might itself be bad?
Centr al Limit Theorem
The goodness of chi squ ar e as a test st at istic for goodness of fit of rea sonably
large samples generally need not be questioned. The reason is th e central limit
theorem [7 vol II, Sect. 8.4 & Ch . 14], which h olds u nder alm ost all condit ions in
which st able appar at us produces consistent dat a:
T he sum X of any set of ind epend ent ran dom variates, regardless
of how distributed, will approach a n orm al distribution
( )N X X ,2 as the sam ple size approaches infinity.
Of cour se, that being tr ue, the sa mple mean will represent ( )E X X and the
sam ple varian ce will represent ( )V X X 2 .
As app lied to th e chi squa re quest ion, if th e dat a in ea ch cat egory fulfil therequiremen ts of th e centr al limit t heorem, the mean in th at cat egory always may be
sta nda rdized for u se as a valid element of a chi squar e sum.
Fu rt herm ore, if errors of measu rement cau sed by instr um enta tion ar e themselves
subject t o th e centr al limit t heorem, an d th erefore a re n orma lly distributed, it can
be proven [7 vol II, Sect. 2.2] th at th e convolut ion of th e instr um ent at ion er ror with
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
22/47
J . M. Williams Chi Squar e df v. 1.23 21
th e physical process also will be subject t o th e cent ra l limit t heorem, as sum ing a
reasonably consistent , repeat ably measu ra ble process.
Categor ies in Ra nd om-Sa mp le Chi Squa re
Nu mber of Cat egories
Above, it was st at ed th at a su fficient st at istic will have less variance tha n one
which is not, for a good fit an d the sam e data . Fr om (17) an d comm ent s
imm ediat ely following, as t he dfof chi squ ar e increases, its va ria nce increa ses, too.
However, t o increa se df, we cha nge th e experiment by cha nging the way th e
cat egories for avera ging the dat a ar e defined. Chi squa re represents th e cat egories,
not the data .
In t erms of the dat a, a set of mean s on ma ny cat egories (in th e limit, one cat egory
per dat um ) always may be combined to ma ke fewer categories. In th is sense, a chi
squar e of great er dfalways will be sufficient relative to one of lesser df, for a given
set of dat a. Looked at in ter ms of individua l cat egories, a wider cat egory mean s
great er likelihood of a cha nge in t he ph ysics, ma king for a bigger var ian ce, given th e
sam e set of dat a being cat egorized. So, in chi squa re test ing, more num erous,
na rr ower cat egories (alwa ys fulfilling n orm ality) will be sufficient rela tive t o fewer,
wider ones. The more th e df, th e closer to sufficiency.
Number of Par ameters
For a given da ta set, it seems obvious t ha t a dding free th eoretical para meter s will
ma ke th e fit of th e theory bett er. In t he limit, one always may fit N dat a cat egories
by an ( N 1)-th degree polynomia l, result ing in zero var ian ce in each cat egory a ndth erefore a perfectly good chi squa re fit. However, this would be very unint erest ing
physics--to claim, in t he limit, th at everyth ing was a certa in n um ber of term s of its
own, u nique polynomial.
Clearly, a physical goodness of fit somehow should involve the complexity of the
hypothesis, as well as t he cat egorizat ion of th e dat a. The issue her e is, how should
we view th e degrees of freedom in a goodness of fit t est wh en we a re free t o var y
both t he dat a categories and th e number of th eoretical par am eters? Is every theory
with n um erous par am eters an d wide-sweeping flexibility bett er t ha n one which
leaves some variance unaccounted?
Mea n ing of d f
Bock [5, Sect. 2.6.3] defines degrees of freedom in ter ms of the d imen sion of a
subspace on which th e dat a ar e projected.
Brownlee [3, Sect. 8.1] defines d egrees of freedom a s "th e n um ber of var iables
minu s th e num ber of independent linear r elations or const ra ints between th em [an d
equa l to th e degrees of freedom of th eir su m of squa res]."
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
23/47
J . M. Williams Chi Squar e df v. 1.23 22
In describing th e chi squa re a pproximat ion t o the mu ltinomial distr ibution,
Brownlee coun ts th e degrees of freedom a s t he n um ber of degrees of freedom for t he
chi squar e minus t he nu mber of par am eters estima ted in order t o perform a cur ve
fit. In an exam ple [3, Sect. 5.2], estimat ion in the sam ple ma kes the dfof th e chi
squar e equal to th e number of cat egories minus 1. He assum es a Poissondistribution for which he su btr acts a nother 1 from df, because the Poisson density
ha s one free par am eter. Thus, Brownlee's final dfis equa l to the nu mber of
categories minus 2.
If Brownlee's appr oach wer e a pplied to our problem of coun tin g df, we would u se
th e form ula,
( )df N N C P= 1 , (24)
in wh ich NC is nu mber of cat egories an d NP is nu mber of free par am eters. At least
one popula r st at istical softwa re pa cka ge uses form ula (24) for chi squ ar e goodness offit.
For exam ple, consider th e ar tificial dat a in F igure 4b, an d th e effect of sequen tia lly
fitt ing th em with a line of slope b an d then with a const an t function a , each t ime
subtr acting away the fit from the dat a:
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
24/47
J . M. Williams Chi Squar e df v. 1.23 23
(a) Data( x )
x
0.00
0.25
0.50
0.75
1.00
0 2 4 6 8 10 12 14
(b) Data(x ) and fit of slope b = 0.058
x
0.00
0.25
0.50
0.75
1.00
0 2 4 6 8 10 12 14
f1
(x) = 0.058 x
(c) Data( x ) - f1(x) and fit of a = 0.112
x
0.00
0.25
0.50
0.75
1.00
0 2 4 6 8 10 12 14
f2
(x ) = 0.112
(d) Data(x) - f1
(x) - f2
(x )
x
-0.25
0.00
0.25
0.50
0.75
0 2 4 6 8 10 12 14
Figure 4b. Effect of l inear parame ters on residuals . The data we re
fit by f(x) = a +bx = 0.112 + 0.058x before the success ive su btract ions .
Using these two para meters a and b in a cur ve fit in some sens e does redu ce th eort hogona lity of th e remainder (residuals). The smaller vertical ra nge means t he
varian ce was reduced, too. With 12 data an d 2 par am eters t o fit, it would seem
reasonable in th is exam ple to use the r ule above to test goodness u sing ( )p2 9 , with
dfcomputed as 9 = (12 - 1) - 2. For exam ple, in Figur e 4b, if theory forced a = 0.10,
th en only b would be fit, a llowing a test on ( )p2 10 , and possibly ma king a ccepta nce
of th e fit more likely for th e same dat a. Clearly, in such an exam ple, th e accepta nce
would be m ore likely for a th eoret ical valu e ofa equal t o its value best fit to th e
dat a. So, a "str onger" th eory which predicted par am eters a ccur at ely without dat a
would get preferen ce in goodness. The Galahad Effect, one m ight sa y, in wh ich a
cur ve gains st rength becau se its th eory is pure.
The problem with th e appr oach of simply subtra cting t he n um ber of free
par am eters is th at it tr eats everything as linear a nd ort hogona l. It gives no more
import an ce to a pa ra meter of the curve tha n t o just one of th e possibly num erous
cat egories. This appr oach seems unqu estionably valid when the cur ve to be fit is
linear. In other cases, although the cat egories might be linearly independent a nd
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
25/47
J . M. Williams Chi Squar e df v. 1.23 24
cont ribute equa lly to the r an dom var iance removed by th e fit, const ra ints on t he
par am eters in genera l will int roduce nonlinear const ra ints on th e cat egory mean s.
Anoth er, en tir ely differen t problem, is th e efficiency of the estim at or us ed for a
par am eter to be fit. Sur ely, we should subtr act a dfwhen t esting a polynomial
hypoth esis, if we ar e fitting an intercept para meter u sing th e data mea n. But, what
if we ar e using t he less efficient, nonpar am etric stat istic, th e median, inst ead of th e
mean ? The median only accoun ts for th e ran k order of th e data a nd is invar iant
un der an y strictly monotonic tr an sform of th e data . Should we subtr act 1 2 dfif we
use th e median of th e data to estima te th e hypoth etical intercept?
For a t heorist interest ed in an objective test of a h ypoth esis with a small nu mber
rof free pa ra met ers, t estin g by (24) above on a lar ge nu mber of cat egories, sa y 10r
or more, essentially ignores th e nu mber of par am eters. Yet, under the null
hypoth esis that th e data ar e well fit by the cur ve, th e data in some sense should be
expected t o be interdependent in a way similar to the wa y they respond to small
chan ges in th e para meters.
Each parameter is critical, to an h ypoth esis with two par am eters to fit. With 100
cat egories of dat a, th ough , dropping ten categories might n ot ma ke mu ch differen ce
in t he conclusion.
Need for a Cor r ect ion t o d f
The pr acticality of the pr oblem ma y be seen by compar ing two hypoth etical
cur ves as shown h ere, one a sine
cur ve with 3 free par am eters
(vertical displacement, phase, andfrequency) and t he other a 2-term
polynomial (cons ta nt + coefficient
of square).
Sup pose we were a llowed to
adjust the param eters shown. If
th e data ra nged only over a 1,
th e sine fit might be bett er th an
th at of the polynomial; with th e
dat a a s shown, a good sine fit
might be hopeless. By th e ru leabove, we would u se a more
str ingent test (more likely to reject goodness) for t he sin e cur ve tha n for t he
polynomial.
We note t ha t a dding a free para meter to the sine (say, amplitude) or a couple of
more ter ms to the polynomia l, might m ak e the fit good in both cases. Alas! In th e
cont ext of a th eory, man y par am eters will be const ra ined by the physics a nd
x
-1.00
-0.50
0.00
0.50
1.00
1.50
2.00
2.50
0.0 0.3 0.5 0.8 1.0
Data(x)
a + cx2
a + sin( bx + c )
Figure 4a. Two il l-f it curve s.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
26/47
J . M. Williams Chi Squar e df v. 1.23 25
un available for ar bitra ry fitting. The problem of th e present pa per is th at we
can not know in advan ce what th e relation will be, for t hose tha t r emain free to fit.
Clear ly, ignoring th e nu mber of par am eters in a goodness of fit t est will make t he
test value of chi squa re such t ha t df df test categor ies= , so th at , for t he sa me curve fit an d
dat a, a ccepta nce of the n ull hypoth esis (th at th e fit is good) will be favored.
However, as above, merely subtra cting th e nu mber of par am eters from ( Ncategories 1)
would seem likely t o under corr ect dfexcept in th e special case of linear regress ion.
We seek a wa y of corr ectin g th e dfin a wa y equally fair , in some sen se, to all
curves to be fit.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
27/47
J . M. Williams Chi Squar e df v. 1.23 26
III. The P aram ete r Correc tion
Defin i t ion of Cor r ect ion
Development and Rationale
Before st ar ting, we emphasize tha t we do not addr ess th e question of
cat egorization of th e data : Pr esuma bly, an optimum categorization a lready has
been achieved, or perh aps th e problem may not be flexible in these ter ms. We
accept t he categories of the da ta set a s a given.
We also accept a s fixed a priori, the level of significance p to be requ ired for a
rejection of goodness of fit.
A clue to th e solution su ggested her e ma y be seen in th e example in Figure 4babove: In th at example, th e data residua ls were just as ran dom as th e origina l dat a
un less fit by a stra ight line. If th e residuals were fit by a stra ight line, the slope
would be 0 and t he int ercept would be 0, becau se th ese ort hogona l componen ts wer e
th e ones subt ra cted down to 0. But, th ey were the only ones subt ra cted away, and
no oth er stru ctu re in the dat a was affected. If we wished to test a linear fit to the
residuals by chi squa re, we would subt ra ct a way df= 2; however, we assu me h ere
th at to test a tr igonometric fit, we would subt ra ct a way nothing from th e df. Even
th ough t he varian ce of th e residuals in Figure 4b was less tha n t ha t of th e origina l
dat a, th e residuals rema in just a s dimensionful as th e data , except as projected on
th e intercept or t he slope of a linear basis set.
Given cat egories C, NC in n um ber, we consider a set of ar bitrar y cur ves K to fit,
each cur ve itself with a n a rbitra ry an d probably not u nique set of par am eters NP in
nu mber . All such cur ves ar e ass um ed single-valued on th e doma in of th e cat egories
of dat a. The quest ion is how to coun t th e dfto be used in a chi squa re test. This
quest ion is equivalent to th e one of coun tin g how ma ny dfshould be subtracted from
th e number of dat a categories. We know th at for rea listic problems we mu st ha ve
N df NP C< < .
The a nswer we propose her e is to accoun t for t he diversit y of possible par am eter s
by ignorin g th eir num ber an d consider ing only their effect. We th erefore propose to
countth e degrees of freedom in th e dat a by measu ring th e ort hogona lity of th e dat aafter t he curve has been fit.
Here is how we prepare th is quant ificat ion: For data on ra ndom var iableXwith
domain { }x , described categorized by some function, { }Data x , we can compu te t he
total varian ce in the da ta cat egories, ( )V Datacat . We choose th e curve K to be fit, fit
it, subtra ct it, and th en exam ine the residual var iance, ( )V Data K res , .
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
28/47
J . M. Williams Chi Squar e df v. 1.23 27
Under th e null hypoth esis tha t t he fit was good, the residua ls should represent
solely ra ndom var iat ion; th ey of cour se ma y not be as sum ed constr ain ed by K, K
ha ving been removed. Under th e nu ll hypoth esis, we assu me here th at th e
ort hogona lity of th ese residu als m ay be seen as being equa l to th e degrees of
freedom of th e cur ve fit t o th e origina l dat a, except wh en consider ed in r egard t o afit by K itself.
Coun tin g Pr ocess
So, having r emoved t he effect of th e curve K, we propose to count degrees of
freedom by r emoving furth er var iance by fitting t he r esiduals with th e sum of a set
( ){ }L L x of linear ly ort hogona l fun ctions , leaving us with a n ew residua l var ian ce,
( )V Data K Lres , , . Sta tist ical significan ce of th e sum will be test ed after a dding each
term ofL to it; any term removing statistically significant variance will be omitted
from L .
The size of th e set L th en will define th e dfof the chi square t o be used in t esting
goodness of fit. We see no reas on n ot t o use a n ort hogona l set of polynomial
functions; because we will enumerate terms in L beginning with L0 , we call t he s ize,
n df+ =1 , the nomial of th e fit:
( ) ( ) ( ) ( )df Data K L n L n R Data K L L xcorrected jj
n
= + ==
nomial , , , , ,1 00
s. t. es , (25)
in which t he j ta ke on values of th e degree ofL not foun d sta tist ically significan t.
To see th e logic of this, if only a few terms in L were required to reduce theresidual varian ce ( )V Data K Lres , , below some th resh old , then th e residuals must
ha ve had low-order str uctur e which was m issed by K; so, th e dat a cat egories were
not orth ogona l in view of th e th eory an d indeed sh ould be test ed at lower df. On
th e oth er ha nd, if nu merous term s were required in L to reduce the r esidua ls below
, the r esiduals must ha ve had enough ort hogona lity to be tested at high df.
St opping Cr iterion
The one rema ining pr oblem is the st opping crit erion: How good a fit should we
requ ire of th e ort hogona l fun ctions ( ){ }L x , none of which being significant?
We find an an swer by again referring to the n ull hypoth esis tha t t he fit is good:
Under th is hypoth esis, we assume t ha t t he initial cur ve-fit by Kmu st ha ve removed
a st at istically significan t a moun t of var ian ce; if so, ( )V Datacat > ( )V Data K res , ; in fact,we may write this in para metric terms a s,
( ) ( )V Data F V Data K cat p res , , (26)
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
29/47
J . M. Williams Chi Squar e df v. 1.23 28
in wh ich Fp is variance rat io at th e one-tailed p-level for t he F isher an alysis of
variance F-test. So, to calculat e where to term inat e th e nomial, we suggest a pplying
th is criterion:
( ) ( ){ } ( ) ( )nomial , , , , , ,maxData K L Min n L V Data K L F V Data K Lres p res n 1 0s. t. . (27)
This just sets a t hr eshold which pr events th e creat ion of nonexistent dfby the
ort hogona l fun ction fitt ing process: If th e n -th polynomial leaves th e n -th residuals
with significan tly less variance tha n t he original 0-th residuals, th en a ll degrees of
freedom (beyond th e st ru ctu re evident ly overlooked by K in the dat a an d now foun d
in th e original r esidua ls) ha ve been discovered.
A similar rationale is used sometimes in factor analysis (as distinguished from
principle components an alysis). It should be ment ioned tha t th e dffor t he
significance level of F is not a difficult issue h ere, becau se we ar e looking a t
residuals a fter our problematical cur ve Kha s been fit and gone. The df for
computing ( )V Data K Lres n, , from a sum of squa res of deviations will be just t he dfofthe data , ( NC 1), minu s the polynomial const ra ints ( n + 1 ). This is becau se, in both
cases, the r esidual variance will have the dimensiona lity of th e dat a u nder t he nu ll
hypoth esis tha t t he fit ofKwas good an d th erefore t ha t L will be in effective in
explaining the data.
So, we ha ve descr ibed a pr ocess to redu ce an ar b i t ra ry problem in
coun t in g p h ysica l p a r a m et er s a s d f, t o one of cou n t in g l inea r ly in d ep end en t
p olynom ia l p a r a m et er s a s d f.
The Corr ection Pr ocedur e
In the following, ( )Vres refers to a variance between categories and is computed
from a sum of squar es by dividing by a dfdependin g on t he nu mber of cat egories
an d their linear in dependence. The sample var iance in the origina l dat a, as
categorized, is ( )V Datacat an d is a varian ce am ong t he category m eans; in cont ext of
an alysis of varian ce, this is called th e "between tr eatm ents" varian ce.
We resta te t he pr oblem: To perform a chi squa re goodness of fit test in th e
cont ext of a ph ysics experiment, a ssum ing predeterm ined dat a cat egories Cand
significance levelp , the degrees of freedom dfma y have to be adjusted for t he free
par am eters in the cur ve to be fit. The proposed an swer is:
1. Fit th e cur ve K to th e cat egorized dat a. Record th e sum of squa res of th e
residu als (differen ces or r ema inder s) after t he fit, stan dar dizing each differen ce by
th e var ian ce in its cat egory. The resu lt will be th e valu e of chi squa re for th e fit:
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
30/47
J . M. Williams Chi Squar e df v. 1.23 29
( )Data j
j
Nj j
jj
Nj j
jj
N
SSX K
V
X K
sd
C C C
2
1
2
1
2
1
= =
=
= = =
. (28)
The stan dar d deviation sd here is the st an dar d deviation of th e j-th cat egory mea n,also called the st an dar d err or, which sh ould be estimat ed from th e origina l data
before th e fit by K. Note th at t he cat egory variance corr esponding to chi squar e,
which will not be u sed in t he goodness of fit t est, is obtain ed by dividing chi squa re
by th e df: ( )V Data K df res Data, = 2 , in which we a pproximat e th e dfconventionally by
N NC P 1 .
IfK is a polynomial, count the number NP of free pa ra meters in t he polynomial
an d comput e df= ( NC 1) NP : This immediat ely is th e dfsought. IfK includes
polynomial free parameters (a constant offset, for example), subtract the number of
th em before per form ing th e goodness of fit t est.Also, esta blish a st opping criterion by looking up or calculat ing th e Fish er
varian ce ra tio value Ffor one-ta iled rejection of a n ull hypoth esis tes t ofdfof
( )V Datacat vs. dfof ( )V Data K res , at significan ce levelp . This $Fp ma y be foun d in
sta tist ics references in th e cont ext of an alysis of var ian ce. The significan ce test will
not a ctu ally be perform ed; however it s dfwill set $Fp for th e stopping crit erion. One
dfwill be th at of th e data , NC 1; th e oth er will be th at of th e data with const ra ints
of th e curve fit Kuncorrected, N NC P 1 . So, find,
( ) ( )( )F df df F p V Data V Data K p,
$
, . (29)
Note th at in general Kwill ha ve removed consider able var ian ce from th e dat a, so
an immediate Frat io test with,
( )
( )F
V Data
V Data K
cat
res
=,
, (30)
easily might exceed $Fp . This will be irr elevant , becau se (a) we ar e not merely
testing whether Ksays someth ing about t he dat a; and, (b) we can not assum e tha t
th e origina l var ian ce over th e whole set of cat egories was const an t.
2. One sum a t a t ime, progressively fit th e residual data with th e sum of th e
ter ms from an ort hogona l polynomia l set ( ){ }L x an d comput e the new residual
varian ce from tha t fit, ( )V Data K Lres , , . When th e nomial equals NC 1, st op. Or,stop when t he st opping crit erion is invoked: This will be when t he ra tio of var ian ces
am ong elements ofL no longer is below $Fp , na mely, when
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
31/47
J . M. Williams Chi Squar e df v. 1.23 30
( )( )
FV Data K L
V Data K LF
res
res n
p= , ,
, ,$
0. (31)
For everyL yielding a statistically significant fit, there was a linear dependence, sosubt ra ct 1 from t he n omia l for ea ch su ch fit.
To compu te t he first an d all subsequent values of ( )V Data K Lres , , , compu te t he onlyvarian ce present, na mely that between t he cat egories:
( )V Data K Lres n, , =
1
1dfresj
j
NC
S S , (32)
with ( )df N nres C = +1 1 , for Ln the highest-degree term in a polynomial ofn -th
degree.
3. Fina lly, to test t he goodness of th e fit, perform a st an dar d chi squar e test with
dfequal to th e nomial. In cases not requiring a corr ection, this dfwill equa l NC 1 .
Step 2 m ay be perform ed expeditiously by doing a pa ra met ric multiple regress ion
of the residuals ( )V Data K res , on a convenien t set of ort hogona l polynomials. AChebyshev set of coefficient s, or a ny of severa l other s, ma y be foun d in [8] and used
with P C softwar e.
Ap p l ica t ion in S yn th et ic Exa m p les
The value of using synthet ic examples is th at th e un derlying, corr ect fun ctiona lform of th e data will be known. Thu s, we can pr edict how a good chi squa re
goodness of fit t est should beh ave.
To insist on st at ing th e obvious, in th e following examples, compa red with th e
int roductory ones above, we ha ve a qua litat ively differen t set of dat a. The next
dat a below ar e simulat ed experiment al data , with real varian ce. Uniformly
distributed ra ndom numbers were used to add th e var iance.
In th e following exam ples, we sha ll assum e th e centr al limit t heorem so as t o be
able to tr eat each bin mean as norma lly distribut ed. In th e first example below, we
sha ll be comput ing th e sta nda rd deviation in each bin u sing 14 -1 degrees of
freedom; one degree is lost becau se th e bin mea n was estima ted from t he sa me 14
dat a. Knowing each bin mean and sta nda rd deviation, we sha ll comput e th e
sta nda rd deviat ion of each data bin mean (stan dar d error of th e mean) and u se it,
not t he corr esponding th eoretical fit pa ra meter, t o stan dar dize each element of th e
chi squar e sum.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
32/47
J . M. Williams Chi Squar e df v. 1.23 31
Fifty Cat egories in Ran dom Ch i Squar e
Quadr at ic Fit
Recall from F igure 4 th at th e qua dra tic fit with 50 cat egories visibly was notgood. Let's look at t he same dat a cat egories, but with th e data r an domized by
add ition of some va ria nce.
Distance x
Field
IntensityI
-0.4
-0.2
0.0
0.2
0.4
0.6
0.8
1.0
1.2
0.15 0.50 0.85
Mean sd of I
Mean sd of Imean
Bad Line:
I = 7(x -1/2)
Cubic Fit:
I = 25| x -1/2|3
Quadrat ic Fit:
I = 7(x -1/2)2
Quartic Fit:
I = 70(x -1/2)4
Figu re 5. Fifty-catego ry fits by eye. The 700 artificial data of Fig. 4
here are added som e random error and again f it wi th the same
categorized quadratic, ( ) ( )I a x b x= = 22
7 1 2 . A purpose ly bad l inear
fi t also is show n, as is the old cubic f i t by eye from Figure 1 above an d a
new quart ic f i t .
Computing Data2 as in (5) or (28) above yields a valu e of about 35.8 for t he s olid
line in F igur e 5, which r epresents t he qua dra tic cur ve. We know from published
ta bles or calculat ion th at ( ).0012 13 = 34.5 an d ( ).001
2 14 = 36.1. We find in th e tables,
also, th at our st opping crit erion will be ( )F. , .001 49 47 250= . The quadr at ic ha s just two
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
33/47
J . M. Williams Chi Squar e df v. 1.23 32
polynomial par am eter s, so our corr ection should allow us imm ediat ely to assign a
corrected dfof ( NC 1) - 2 = 47; however, we proceed with t he complete correction
an alysis, to show how it work s.
If we compu ted our correction t o df= 14 or a bove, and if no ter m wa s significan t,
an d no residual reached the L Ln0 var ian ce ra tio stopping crit erion of 2.50, we
would have included sufficient dfto know we could a ccept t he n ull hypothesis t ha t
th e fit was good.
The resu lts of th ese compu ta tions ar e in Table 5. To redu ce possible err ors on
th is first real t rial of th e corr ection, th e ta ble was const ru cted by repeat edly fitting
nonorth ogona l polynomials with t erm s of th e form, a xnn , as described for t he
correction above. The resu lt was verified by repea tedly run nin g mu ltiple regression
on a n in crea sing nu mber of ter ms of th e Chebyshev ort hogona l polynomia ls of th e
first k ind. In pr actice, of cour se, only the regr ession would be necessar y.
The low, relatively unchanging $F ra tios for t he lowest-order nomials in Ta ble 5ar e a good indicat ion t ha t we should test a t h igh df.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
34/47
J . M. Williams Chi Squar e df v. 1.23 33
Table 5. Correcte d d fcalculations for the 50-category qu adratic in
Fig. 5 . All but the rightm ost two columns we re done by curve-fi t with
nonorthogonal a xnn polyn omia ls, term by term . The "MR"column s
we re done by mult iple regress ion on Chebyshev polynomials .
Ra w
Nom-
ia l
DescriptionPoly
Residual
SS
Poly
Residual
Variance
$F
MR
Residual
SS
MR
Residual
Variance
- Data2
= 35.82 - - - -
0 ( )V Data K Lres , , 0 0.4725 0.00964 - 0.4725 0.00964
1 ( )V Data K Lres , , 1 0.4695 0.00978 0.985 0.4695 0.00978
2 ( )V Data K Lres , , 2 0.4130 0.00879 1.097 0.4128 0.00878
3 ( )V Data K Lres , , 3 0.4127 0.00897 1.075 0.4127 0.00897
4 ( )V Data K Lres , , 4 0.4082 0.00907 1.063 0.4079 0.00906
8 ( )V Data K Lres , , 8 0.3643 0.00888 1.085 0.3613 0.00881
20 ( )V Data K Lres , , 20 0.3597 0.01240 0.778 (n ot don e) (n ot don e)
None of th e fits wa s significan t, a nd a ll compu ted va lues of $F sta yed well below
2.50, so we ar e justified in t estin g at ( )p2 20 or perh aps higher df. We kn ow
alrea dy that we can not reject t he nu ll hypoth esis at df = 14, so an y higher dfsurely
will yield a sta tist ic sufficient t o th e sam e conclusion . Fu rt her a na lysis is not
requ ired, an d accordin g to our corr ected chi squa re t est, we accept t he fit of th e
quadratic as good.
It sh ould be ment ioned her e th at a least -squar es or other similar fit of a 20-term
polynomial techn ically is not tr ivial, even on a fast compu ter . It would be ra th er
amazing that such a polynomial really could be fit optimally to a set of data small
enough to appr oach the num ber of term s being fit. But, we assu me the best and
look t o th e fut ur e for better .
Linear Fit
Table 5a sh ows th e sam e test as Ta ble 5, but for t he obviously bad fit of a st ra ight
line passing on a st eep slope thr ough t he middle of th e data (Figure 5). Again, the
two polynomial parameters should allow immediate correction to df= 47; we
cont inue an yway, for demonst ra tion pur poses.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
35/47
J . M. Williams Chi Squar e df v. 1.23 34
Table 5a. Computation s as in Table 5 for corrected ch i square
goodne ss of fi t test of the bad ( ) ( )I a x b x= = 7 1 2 l ine in Fig. 5. Thestop on the third row makes the n omial = 1; how ever , the remainder i s
supplied to show the result of a bad f i t . Terms s ignificant in regression
are shown in parenthe ses .
Ra w
Nomial
Descr ipt ion Residua l
SS
Residual Variance$F
- Data2
= 106.85 - -
0 ( )V Data K Lres , , 0 101.58 2.073 -
1 ( )V Data K Lres , , 1 2.8708 (*L 0,L 1) 0.05981 34.7
2 ( )V Data K Lres , , 2 0.4128 (*L0-L 2) 0.00878 236.1
3 ( )V Data K Lres , , 3 0.4127 (*L 0,L 1) 0.00897 231.1
4 ( )V Data K Lres , , 4 0.4079 0.009064 228.7
8 ( )V Data K Lres , , 8 0.3616 0.008819 235.1
12 ( )V Data K Lres , , 12 0.3607 0.009749 212.1
As before, ( )F.
, .001
49 47 250= . But, as obvious in Table 5a, the bad linear fit has
left t oo mu ch st ru ctu re un accoun ted an d push es $F immediately out to 34.7, mak ing
th e nomial 1; this would imply testing a gainst ( ). .0012 1 108= . In an y case, becau se
( ).0012 49 is only about 85, th e Data value of 106.85 quickly allows us to reject the fit
of the bad line as not good no mat ter what th e df.
Cubic Fit
Once more, we look at a differen t curve fit to the da ta in Figur e 5: This tim e, it is
th e cubic from Figur e 1, replotted in F igure 5 an d origina lly rejected a s a fit to the
dat a before ran dom variat ion ha d been added. We again proceed with the ana lysis,
alth ough t wo polynomial para meters permit an immediate corr ection t o df= 47.
The raw Data2 for t he cub ic fit is foun d t o be 64.25; because ( ).001
2 49 is about 85.3,
th e dat a va lue from th e cubic might a llow th e fit t o be rejected, if our corr ection
were to reduce th e dfby enough. Fr om published ta bles or calculat ion, we find th at
( ). .0012 33 63 9= and ( ). .001
2 34 65 2= ; so, if we ret ain a corr ected dfof less t ha n 34, we
will reject the fit of the cub ic as not good.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
36/47
J . M. Williams Chi Squar e df v. 1.23 35
As before, the correction stopping criterion will be ( )F. , .001 49 47 250= . The resultsof th e an alysis ar e in Table 5b.
Table 5b. Computation s as in Table 5 for corrected ch i square
goodn ess of f it test of the cubic ( )I a x b x= = 33
25 1 2 in Fig . 5.
Terms s igni ficant in regress ion are show n in parentheses .
Ra w
Nomial
Descr ipt ion Residua l
SS
Residual Variance$F
- Data2
= 64.25 - -
0 ( )V Data K Lres , , 0 0.834828 0.017037 -
1 ( )V Data K Lres , , 1 0.780223 0.016255 1.0482 ( )V Data K Lres , , 2 0.479143 (*L 0-L 2) 0.010195 1.671
3 ( )V Data K Lres , , 3 0.476474 0.010358 1.645
4 ( )V Data K Lres , , 4 0.395849 0.008797 1.937
5 ( )V Data K Lres , , 5 0.394304 0.008961 1.901
7 ( )V Data K Lres , , 7 0.364182 0.008671 1.965
8 ( )V Data K Lres , , 8 0.360949 0.008804 1.935
12 ( )V Data K Lres , , 12 0.360192 0.009735 1.750
20 ( )V Data K Lres , , 20 0.352694 0.012162 1.400
Alth ough Ta ble 5b omit s some result s, th e value of $F wandered up to the
ma ximu m sh own a t a nomial of 7, th en declined gra dua lly to the n omial 20 value
shown; this was the limit of th e aut hor's PC softwar e. There is no sign th e stopping
crit erion would be invoked, an d only one n omia l ter m wa s dr opped becau se of
significan ce, so the corr ected chi squa re dfwould be NC =1 1 2 46 , the fina lredu ction by two being becau se the fit was a t wo-par am eter polynomia l. We repea t
that the immediate correction, because of the two polynomial terms, by subtraction
of 2 from 49, would leave a proper corrected dfof 47; in effect, we pr eten ded t her e
was at least one nonpolynomial para meter t o test t he procedure. But, 46 34>> still
mean s th at th e cubic fit to th e Figure 5 da ta would be accepted as good.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
37/47
J . M. Williams Chi Squar e df v. 1.23 36
Quar tic Fit
Let's look once again a t a cur ve fit to th e data in Figure 5: This time, it is the
qua rt ic, which wa s fit by eye.
The raw Data2 for th e quar tic fit is foun d to be 83.01. Fr om published ta bles or
calculat ion, we find t ha t ( ). .0012 47 82 7= and ( ). .001
2 48 84 04= ; so, if we ret ain a
corrected dfof less tha n 48, we will reject t he fit of th e quar tic as n ot good. The
common practice in (24) above of setting dfat N NC P 1 , or our str ictly app lied
correction procedure, would assign df= 47, cau sing an imm ediat e rejection of th e fit.
We pretend th ere is at least one nonpolynomial par am eter a nd pur sue our corr ection
here t o see how it perform s in t his int eresting ma rginal case.
As before, the correction stopping criterion will be ( )F. , .001 49 47 250= . The resultsof th e corr ection a na lysis is in Table 5c.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
38/47
J . M. Williams Chi Squar e df v. 1.23 37
Table 5c. Computation s for corrected chi square goodne ss of f i t test
of the quartic ( ) ( )I a x b x= = 44
70 1 2 in Fig. 5 . Terms s ignif icant in
regress ion are show n in parentheses .
Ra w
Nomial
Descr ipt ion Residua l
SS
Residual Variance$F
- Data2
= 83.01 - -
0 ( )V Data K Lres , , 0 0.824809 (*L 0) 0.016833 -
1 ( )V Data K Lres , , 1 0.766592 0.015971 1.054
2
( )V Data K L
res
, ,2
0.658320 0.014007 1.202
3 ( )V Data K Lres , , 3 0.650998 0.014152 1.189
4 ( )V Data K Lres , , 4 0.407871 (*L 0-L 4) 0.009064 1.857
5 ( )V Data K Lres , , 5 0.406808 0.009246 1.821
6 ( )V Data K Lres , , 6 0.383786 0.008925 1.886
7 ( )V Data K Lres , , 7 0.366298 0.008721 1.930
8 ( )V Data K Lres , , 8 0.361569 0.008819 1.909
12 ( )V Data K Lres , , 12 0.360525 0.009744 1.728
20 ( )V Data K Lres , , 20 0.352887 0.012169 1.383
As sh own in Table 5c, th e value of $F wandered up to a ma ximum again at a
nomial of 7, but t he st opping crit erion was not invoked. However, th e low-degree
an alysis was enough to reveal two terms of significan t residuals; the pr etend-
corr ected chi squa re dfwould be NC =1 2 2 45 , and t he quar tic fit to the Figure
5 dat a is r ejected a s not good. Actu ally, we of cour se would h ave rejected t he fit at49 - 2 = 47 df, becau se th ere were just t wo polynomial para meters an d no oth ers.
Fina lly, before leaving th is exam ple, we should rema rk t ha t t he sam e quadr at ic
function wh ich failed misera bly in F igure 4 ea sily was s hown a good fit in Figur e 5.
The dispersion in the Figur e 5 dat a clearly weakened the test. In fact, even th e
cubic of Figur e 1 was a ccepted a s a good fit. When t he ra ndomn ess is grea t enough,
alm ost a ny fit will be good en ough --but , not m uch confidence in t he h ypoth esis will
be added by not rejectin g it, eith er.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
39/47
J . M. Williams Chi Squar e df v. 1.23 38
We consider next a more realistic simulated example.
Applicat ion in a Twenty Cat egory Decay Simu lat ion
Assume t here is a t heory which predicts t ha t a certa in ma teria l, initially
tr an slucent, will fluoresce if illumina ted with light at a certa in, short wa velength.Let's consider a h ypoth etical experim ent in which some of th is mat erial is enclosed
in a spherical detector, a glowball, and irradiated with a pulse of short-wavelength
light . We wish to predict the tim e cour se of th e glow inside th e sphere.
The hypoth esis is tha t a 4 Pla nckian detector will respond a s th e convolut ion of
th e Gaussian light exciting impulse with a decaying negative exponent ial of the
form,
( )( ) ( )
E t w a I I d e e
tw t a; , | 0 0
0
1
2
2
=
, (33)
which h as t wo free par am eters, pulse width w an d time-const an t a, the initial
intensity I0 being fixed by theory.
The resu lt of th e simulat ed experiment is given in Figure 6.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
40/47
J . M. Williams Chi Squar e df v. 1.23 39
time t
Energy
E
0
5
10
15
20
25
30
0 5 10 15 20
Mean sd of E
Mean sd of Emean
Figure 6. Glowb all decay expe riment. Time t and energy E are in
arbitrary un its . The sol id line represe nts a fi t of formula (33) above,
wi th I0 = 8 f ixed by the ory and w = 2.9 an d a = 17 f i t by eye to the 20 bin
means s hown.
Fr om published ta bles or calculat ion, we find th at ( ). .0012 19 438= and
( )F. , .001 19 17 4 81= . Computing Data2 , we get over 75. Ther efore, th e null hypoth esis
of a good fit is r ejected at once.
The analysis, although unnecessary for this particular decision, is given in
Table 6. Looking at t he several low-degree valu es ofL in t he t able, it is obvious
th ey are chan ging slowly, an d, clear ly, after compu tin g L2 , it would be very
reasonable to conclude t ha t a test at df= NC 1 1 was just ified.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
41/47
J . M. Williams Chi Squar e df v. 1.23 40
Table 6. Glow ball decay expe riment. Computation s for correcte d
chi square goodne ss of fi t o f the c urve in Fig . 6, which w as text equat ion
(33) f i t by eye . Two free paramete rs , w an d a , were avai lable to f i t .
Terms in parenthes i s w ere s igni f icant in regress ion.
Ra w
NomialDescription
Residual
SS
Residual
Variance$F
- Data2
= 75.7 - -
0 ( )V Data K Lres , , 0 9.551 (*L 0) 0.503 -
1 ( )V Data K Lres , , 1 9.551 0.531 0.947
2 ( )V Data K Lres , , 2 7.631 0.449 1.120
3 ( )V Data K Lres , , 3 6.493 0.406 1.239
4 ( )V Data K Lres , , 4 6.226 0.415 1.212
8 ( )V Data K Lres , , 8 5.432 0.494 1.018
12 ( )V Data K Lres , , 12 5.559 0.794 0.634
None of th e compu ted va lues of $F exceeded even 2, alt hough t he L0 term would
not be coun ted; so, if the initia l value of 75.7 for chi squa re ha d n ot been so obvious,we would have performed the corrected test against ( ) ( ) . .001
2
001
21 1 18NC = .
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
42/47
J . M. Williams Chi Squar e df v. 1.23 41
But wha t if th eory allowed us t o adjust t he th ird para meter , I0 , for a bett er fit?
The result of a t hr ee-par am eter fit is shown in Figure 7 an d is ana lyzed in Ta ble 7:
t ime t
Energy
E
0
5
10
15
20
25
30
0 5 10 15 20
Mean sd ofE
Mean sd ofEmean
Figure 7. Glow ball decay expe rimen t. Analysis w ith three freeparameters . Time t and energy E are in arbitrary units . The sol id l ine
represe nts a f i t of formula (33) above, with I0 = 8.2, w = 2.95, an d a = 1 7
as f i t by eye a gain to the sa me 20 bin mea ns as in Fig. 6 .
As sh own below in Table 7, Data2 , with an additiona l free par am eter, now is just
34.9, which is below ( ). .0012 19 438= . Ther efore, assum ing a corr ection for free
par am eter s, no immedia te decision can be ma de. Becau se th e lowest dfallowing
accepta nce of the fit is 14 ( ( ). .0012 14 361= ), we seek a decision a s to wheth er t he df
might be 14 or more.
The corr ection an alysis is done in Table 7. In t ha t Table, th e wandering of th e
residual va rian ce an d t he lack of consistent increase of $F make it quite certain t hat
th e critical Fvalue of 4.81 never will be reached. Ther efore, th e fit of th e theory
using th ree free par am eters m ay be accepted as good.
Compar e the a ppeara nce of th e two fits in Figures 6 and 7: The eye can ha rdly
decide, whereas th e sta tistics a re u nequivocal, given t he category set an d th e
significance level.
-
8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses
43/47
J . M. Williams Chi Squar e df v. 1.23 42
Table 7. Glow ball decay data as in Table 6. Computations for
corrected ch i square goodness of f i t of the curve in Fig. 7 . Three
parameters ,I0, w , and a , were free to f i t .
Ra w
Nomial
Descr ipt ion Residua l
SS
Residual
Variance$F
- Data2
= 34.97 - -
0 ( )V Data K Lres , , 0 9.179 0.4831 -
1 ( )V Data K Lres , , 1 9.173 0.5096 0.948
2 ( )V Data K Lres , , 2 8.868 0.5216 0.926
3 ( )V Data K Lres , , 3 6.622 0.4139 1.167