degrees of freedom: a correction to chi-square for physical hypotheses

8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

1/47

De gree s of Free dom : A

Correc tion to Chi S quare

For P hysical Hypothese s

by J ohn Micha el Williams

2010-08-15

jmm will@comcast .ne t

Copyright (c) 2010, by J ohn Micha el William s

All Rights Rese rved


2/47

J . M. Williams Chi Squar e df v. 1.23 1

Abstract

In common practice, degrees of freedom (df) may be corr ected for t he n um ber of

th eoretical free para meter s as though para meter s were th e same as data cat egories.However, a free physical pa ra meter generally is n ot equivalent to a dat a cat egory in

terms of goodness of the fit.

Her e we use syn th etic, nonr an dom da ta to show th e effect of choice of

categorization and dfon goodness of fit. We th en explain th e origin of th e df

problem an d show how to avoid it in a th ree-step pr ocess:

First , the t heoretical curve is fit t o the dat a t o remove its

varian ce, leaving what, u nder th e nu ll hypoth esis, should be

structureless residuals.

Second, the r esidua ls ar e fit by a set of ort hogona l polynomials u p

to th e degree, should it occur , at which significan t va ria nce was

removed.

Third, th e nu mber of nonsignifican t polynomial term s in th e

origina l + ort hogona l set become t he dfin a sta ndard chi square

test.

This process reduces a general dfpr oblem to one of polynomial dfand allows

goodness of a fit t o be deter min ed by dat a cat egorizat ion a nd s ignifican ce level

alone. An example is given of an evalu at ion of physical dat a on neu tr ino

oscillation.


3/47


Table of Conten ts

Abstract ______________________________________________________________________ 1

Table of Contents ______________________________________________________________2

I. Introduction ________________________________________________________________3

Parametric Statistics_______________________________________________________________ 3

Statistical Tests in Physics __________________________________________________________ 4

Doing Physics by Category__________________________________________________________ 4

Categories in NonRandom Chi Square ________________________________________________ 6

II. Meaning of Chi-Square _____________________________________________________16

Formalisms______________________________________________________________________ 16

Categories in Random-Sample Chi Square ___________________________________________ 21

Meaning of df____________________________________________________________________ 21

Need for a Correction to df_________________________________________________________ 24

III. The Parameter Correction __________________________________________________ 26

Definition of Correction ___________________________________________________________ 26

Application in Synthetic Examples __________________________________________________ 30

Application in Neutrino Oscillation Theory___________________________________________ 42

IV. Conclusion _______________________________________________________________46

References___________________________________________________________________46

Acknowledgements ____________________________________________________________46


4/47


I. Introduction

Pa r a m et r ic S t a t i st i cs

Hypoth eses ar e theoretical assu mpt ions. Scientific hypoth eses ma y be tested

against empirical dat a. It is funda ment al to all science, that no hypoth esis ever can

be proven by experimental dat a; an hypoth esis only may be r endered more or less

likely, or disproven.

The m ost fam ous m odern u se of forma l sta tist ics to test scient ific hypotheses

dat es back to R. A. Fish er [1], who, in th e early twent ieth centu ry, published a

meth od for an alysis of th e var iance in experimenta l data . Fisher developed

sensit ive ways of detectin g differen ces a mong da ta as a fun ction of category of

tr eatm ent. The treat ment s in his favorite exam ples were on agricultu ra l variables

such as crop fert ilizat ion, light, water ing, an d so fort h. Fish er's meth ods werebroad in scope--the nonparametric Fisher Exact Testis na med for h im--but his

an alysis of varian ce was based on th e apparen tly narr ow assumpt ion th at t he data

were sampled independently an d would be normally distr ibuted with const an t

variance.

Defining a r an dom var iableXwith n orm ally distr ibuted probability density

( )N X; , 2 by

( )( )

P x e

x

X = =

1

2 22

2

2

, (1)

it easily ma y be seen th at th e two par am eters of such a distribution are th e mean

and t he variance 2 . The mean (expected value) of dat a ma y be ta ken to define th e

effect of an experimenta l treat ment . A sam ple ta ken of dat a depending on a

mixtu re of differen t effects will have a bigger varia nce tha n a sa mple dependin g on

one; so, the var ian ce ma y be an alyzed to show differen ces in th e un derlying mea ns.

If we use m ean values to represent cat egories of data, t hen hypoth esis testing by

an alysis of var ian ce ma y be used to test differen ces am ong the cat egories. Fish er's

app roach to ana lysis of var ian ce cam e to be kn own a s param etric statis tics .

Fisher's a na lysis of var iance rat iona le has been su pplemented by Bayesianmet hods [2, 4]. Pr oblems not fitt ing Fish er's par adigm ha ve extended th e scope of

hypothesis-testing statistics into the nonparametric realm [3, Ch. 7; 6, Sect. 9.2 &

Ch. 12]. Nonpara metr ic sta tistics att empt to resolve questions for dat a in very

sma ll samples, for mismat ched var iances, an d for experiment al r esults sometimes

more qualitative than numerical.


5/47


Sta t i s t i ca l Tes t s in Physics

A problem with hypoth esis testing in physics is th at mu ch of th e subject m at ter is

mult iply par am eterized. In physics, a good hypoth esis ra rely sta tes anyth ing aboutcat egories; instea d, it provides an explan at ion of a set of one or more ph enomena

which may var y continu ously. The dat a typically involve broad ra nges of ener gy or

moment um , or avera ges of ma ny car efully winnowed events . Quest ions ar ise of, Is

th e spin 1/2? Does th e field var y as 1/r? The conn ection of an y physical hypothesis

with data typically is through multidimensional continua of complexity.

Rath er t ha n an alyzing experimenta l variat ions t o discover differences in m eans

on na rr ow doma ins of one or m ore cont inuu a, th e preferr ed way to demonst ra te

supp ort for a physical hypoth esis is to show tha t it m ay be used u nequ ivocally to

cont rol some ph ysical pr ocess in su ch a way a s no oth er kn own h ypoth esis might

explain.

For exam ple, a demonst ra tion of th e Ha ll effect would involve an unequivocal

excess of cha rge along one side of a flat condu ctin g str ip. To achieve this

confirmation of the predicted effect, superior instrumentation or measurement of

previously ignored qua nt ities would be adopted. The gra dient of th e cha rge, th e

effect's time-course of development, and other factors would be examined for

consist ency with th e hypoth esis. Rar ely would physical meas ur emen t be on such a

na rr ow ran ge of values tha t t he var iance would be const an t, independent of th e

mea n. Also, if not all evidence su pported th e hypothes is, alt ern at ives would be

explored inst ead. Merely showing th at one side of a condu ctin g str ip seemed more

likely to have a char ge excess th an th e other, a test between two cat egories on one

variable, would be a n un sat isfying r esult.

Doing Ph ysics by Ca t egor y

Let's consider t he special case of curve-fitt ing hypoth eses. The problem is to

distinguish among various hypoth eses of th e shape of th e cur ves. We might sta rt

by ar ra nging th e data a long one or more cont inua . However, on a cont inuu m, two

data (probably) never coincide; this means we have slim bases for estimating the

mean , var iance, or a ny oth er par am eter of th e assum ed underlying random

variable.

So, to use par am etr ic st at istics, we define cat egory boun dar ies (bins). By ma kingth e bins wide enough, t he n umber of dat a a vailable in each will be large enough t o

provide a good estimat e of th e bin mean . We ma y assu me th e var iance in a bin to

be const an t, but only if th e bins a re na rr ow enough. Sometimes riskily, we may

assu me th e variance over all bins t o be the sa me, const an t value.


6/47


A simple example will illust ra te th is approach. Ima gine th at t wo different

physical theories ar e to be test ed against a certa in data set. Perh aps, the th eories

might r elate t o field int ensity Ias a fun ction of dista ncex.

Hypothesis I: The dat a follow a quadr at ic fun ction. This would st ren gth en beliefin Theory I.

Hypothesis II: The data follow a cubic function. This would support Theory II.

Here a re th e two hypoth etical predictions on a da ta set which h as been generat ed

purely artificially, with no randomization:

Figure 1. Simulated data f i t by eye . Deviations from the data see m

greater for the cubic than for the quadrat ic . The curves represent

( )I x= 7 1 22

for the quadratic, and ( )I x= 25 1 23

for the cu bic.

Clear ly, visua l inspection tells us th at t he quadr at ic fit is better. So, we

imm ediat ely should drop Theory II in favor of Theory I. Or, should we? How can wetest how well Theory I ha s been confirm ed? Is it possible Theory II might be more

elegant to derive, simpler, or oth erwise more a tt ra ctive tha n Theory I, even th ough

th is data set was not fit so well?

We shall discuss t he special case in which we wish t o test Theory I again st th e

dat a, leaving out Th eory II entirely. This kind of sta tistical test, which mea sur es

th e fit r at her th an rejecting a competing fit, is called a goodness of fittest.

Distance x (arbitra ry units)

Field

Intensity

I

0.0

0.2

0.4

0.6

0.8

1.0

1.2

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

Quadratic Fit

Cubic Fit

Data


7/47


The m ost popular goodness of fit t est is by ch i sq u a r e , a sta tistic first derived by

Helmert [2, Sect. 3.4.3] and later independently derived and popularized by Karl

Pear son; it is na med by Pearson after th e Greek letter chi (). Chi squa re simply

is th e distribut ion of the su m of squar es of a set of stan dar dized, norma lly

distributed data .

Categor ies in NonRand om Chi Squa re

Before pr esent ing form alities a bout chi squ ar e, we first develop its a pplicat ion t o

th e cur ve-fit exam ple in F igure 1 above. We ignore th e cubic cur ve for a wh ile.

One Cat egory

Suppose we comput ed th e mean values of the dat a a nd of th e quadr at ic cur ve on

th e interval of interest , 01 0 9. . x . We would get, usin g all 14 ava ilable dat a

points sh own in F igur e 1, a sta tistically invalid ana lysis but a n inst ru ctive exam ple

as sh own in Ta ble 1.

Table 1. Comparison of data vs quadratic means on one

category (1 degre e of freedo m)

Distance x Data

of I

Quadratic

0.151 0.73

0.201 0.59

0.251 0.44

0.301 0.300.351 0.18

0.401 0.08

0.451 0.02

0.501 0.00

0.551 0.02

0.601 0.08

0.651 0.18

0.701 0.31

0.751 0.45

0.801 0.60

mean = 0.2855 I = 0.3733

sa mple sd ofI= 0.2349 I = 0.3339

sa mple sd of m ea n 0.0628 I

= 0.0892


8/47


The mea n of th e qua dra tic was compu ted a s follows, using th e expected valu e

operator,E,

( )( ) ( )

I E Id x I x

dx

dx x

x= =

=

0 1

0 9

0 1

0 9

2

0 1

0 9

0 1

0 9

7 1 2

03733.

.

.

.

.

.

.

. . . (2)

The operator ( )E means t he same a s the familiar ;E is used her e for symm etry

with th e var iance operat or, ( )V .

The standard deviation of the quadratic was computed as follows,

( ) ( )( ) ( ) ( )[ ]( ) ( )

I V I E I E I E I E I sqrt d x I x

dx

d x I x

dx

= = = =

2 2 2

2

0 1

0 9

0 1

0 9

0 1

0 9

0 1

0 9

2

.

.

.

.

.

.

.

.

( )( )[ ]I sqrt

dx x

dx

=

7 1 2

0 373 0 3339

2 2

0 1

0 9

0 1

0 9

2.

.

.

. . . . (3)

The sta nda rd deviations of th e mean s were comput ed by dividing the sta nda rd

deviations of the variables (data or quadratic) by the square root of the sample size,

N = 14 3742. .

Now, inst ead of just noticing in Table 1 tha t t he mea ns a re about 0.09 apa rt , and

th at t he sta nda rd deviat ions of th e means a re about th e same size, suggesting no

significan t difference between t he t heory a nd th e dat a, let's look a t th e chi squa re

statistic. A unit-normal distribution is defined as one with m ean 0 an d varia nce 1,

we describe such a distribution as ( )N 0 1, , a notat ion which may be relat ed to

equa tion (1) in a n obvious way.We follow Brownlee [3, Sect. 1.27], an d other s in defining chi squ ar e as a sum of

squar ed un it-normal deviat es: Given a nu mber NC of data cat egories,

( )Data ii

N

NC

2 2

1

0 1==

, , (4)


9/47


with comput ed or ta bulated cum ulat ive probability P at significan ce level xp ,

( )[ ]P df x pp2 , represented as ( )p df2 .

In comm on usa ge, which we shall quest ion lat er, th e degrees of freedom var iable

dfis the n umber of sta tistically independent cat egories of th e data , and t he un it-normal deviates are differences between theoretical (or otherwise anticipated)

values an d data values, one for each cat egory. Becau se the differen ces are defined

sta nda rdized, their mea ns a ll will be expected t o be 0 under t he nu ll hypoth esis tha t

the fit is good.

Usua lly, the var iance in ea ch dfcat egory would be th e varian ce un der t he nu ll

hypoth esis; in our int roductory exam ples, th is is the var iance under th e assu mpt ion

th at th e data were distr ibuted ran domly in each cat egory as determin ed by th e

cur ve being fit. In t his intr oduction, we sha ll be looking at t he cur ve fit to dat a

generat ed under a th eory an d exam ining the adequa cy of th e representa tion of th at

th eory by th e fitt ed cur ve. So, we are looking at two differen t t heories, in a sen se,one a simplificat ion of th e other . In our int roductory examples, ther e is no ra ndom

component other th an rounding error of th e nu merical values.

In common u sage, we would estima te t he var iance in each cat egory u sing th e

dat a, th us losing a degree of freedom [4, Sect. 10.11]. In t hese int roductory

examples, we assume the appar atu s retur ns good data , and th at we are measuring

th e lawful realization of a physical principle. We assu me we know the mean s and

th e cha nge in th em within a cat egory; so, we need not use da ta to estimat e

an yth ing; we mer ely ar e mea sur ing th e fit of a cur ve. So, in the first few examples

following, we sha ll use th e varia nce of th e qua dra tic cur ve to sta nda rdize in each

cat egory, rath er th an t he varian ce of th e equally th eoretically corr ect dat a. Theactua l difference in t he var iances is not lar ge anyway, as ma y be seen in th e

tabulations below.

So, to evalua te chi squa re, we ma y compu te t he following sum :

( )

( )

( )[ ]( )

Datai i

ii

N

i

ii

NX E X

V X

X E X

V X

C C

2

2

1

2

1

=

=

= =

, (5)

in which, in each category, the t heoret ically expected va lue ( )E X is subtracted from

th e value of the observed datu m X. In genera l, each i-th cat egory r epresents a

mea n of some observat ions, from which observed mea n will be subt ra cted t he i-th

expected mean . Under t he nu ll hypoth esis th at t he fit is good, the difference in the

nu mer at or of th e sum in (5) will, of cour se, be zero. Dividing each differen ce by the

expected standard deviation ofX sta nda rdizes the var iance of each of th e i elements

in the sum.

We notice here, immediat ely, th at our chi squa re ignores t he t heoretically

expected shape of th e cur ve: It just sums squar ed differences. This mean s tha t chi


10/47


squar e is not tr uly a para metr ic sta tistic. Ea ch cat egory is sta nda rdized

individually by its own expected va ria nce; th erefore, chi squar e does not r equire a

const an t varian ce over all cat egories tested. This is a great advant age in physics.

The sha pe of the t heoretical curve enter s only locally, at th e gran ular ity of the

individua l cat egories, when a differen ce ( X Ei i ) is taken an d a variance estima tedperha ps on th e assum ption t ha t t he varian ce will be const an t.

Anyway, let's decide to reject t he fit at th e p = .001 level. Keep in min d th at we

do not have a valid ran dom sam ple an d tha t t his is mean t as a th ought -provoking

int roduction to a problem of sta tist ical inference in physics. The ta bles of chi

square tell us tha t ( ).0012 1 11 . We tr ivially comput e chi squar e from the Table 1

means t his way:

( ) ( )( )

DataX E

V2 1 1

2

1

2

2

2855 3733

089210=

=

. .

.. ; (6)

an d, confirmin g our visual impr ession, th e resu lt is not significan t, so th e fit ma y be

considered good to th e extent t he overall averages of the dat a a nd qu adr at ic

hypothesis in Table 1 are not shown differen t. And we ar e confident th e fit would

have been shown good with likelihood of error p = .001, ha d th e stat istical

requirements been fulfilled.

We should ment ion h ere th at th e difference in m eans in Table 1 also might h ave

been tested by oth er sta tistics, such a s Stu dent's t. Above, we have accepted t he

mean an d varia nce of the qua dra tic fun ction a s th eoretically errorless, and ha ve

accepted the quadratic's computed ( )N . ,.3733 0892 distribution a s th at of the data ,

un der the nu ll hypoth esis. In cont exts oth er tha n th e present int roduction, this

would be qu ite wrong sta tistically, becau se, visibly, the mean an d t he varian ce

cha nge systemat ically over t he domain of interest (sta nda rd deviation is a measu re

of slope)--an d we ha ve a qu adr at ic cur ve, not a ra ndom norma l variat e.

As a check, ignoring t hese pr oblems, at p = .001 (two-ta iled), a norm ally

distributed da ta mean would ha ve to lie about 3.3 stan dar d deviat ions of th e mean

awa y from th e qua dra tic mea n to reject th e goodness of th e fit. This would be about

0.30 distance units, clearly about 3 x greater than the 0.09 difference in Table 1.

Usin g the n orm al significan ce value in a second check, if we subst itu te 0.30 in th e

chi squ ar e nu mer at or of (6) above, we get 0 30 0 09 1112 2

. . . , closely mat chin g th e chisqua re significan ce level of 11. This is a t au tology, confirm ing our ar ith met ic,

because by definition ( ) ( )[ ]. . ,0012 0012

1 0 1 N .

Two Ca tegories

Now, let us double the nu mber of cat egories an d th e sam ple size, too. We

increase t he density of the dat a point s but keep the sa me qua dra tic fitting function:


11/47


Distance x

Field

Intensity

I

-0.1

0.1

0.3

0.5

0.7

0.9

1.1

0.2 0.3 0.4 0.5 0.6 0.7 0.8

x1x2

Figure 2. Two-category fi t by eye . More of the same arti f icial data as

in Fig. 1 , for the sam e qu adratic f i t ( ) ( )I a x b x= = 22

7 1 2 .

This time, we part ition t he domain of an alysis x int o two regions, or bins, called

x1 and x2 , as shown in Figure 2. The horizont al lines show th e different m eans ofI

in the two bins. We ta bulate th e data shown, an d recalculat e the para meter s of th e

qua dra tic separ at ely for th e two ha lves, as was done above. We cha nge the problem

slightly here, by integrat ing th is time on t he sm aller, more pr ecise doma in of

( )x . ,.15 85 instead of ( )x . ,.10 90 , mak ing the domain of evaluat ion of th e quadr at ic

ma tch the doma in of th e sampled dat a more closely. Ea ch ha lf now ha s 14 dat a

points. The result is in Table 2.

Looking at Figure 2, with two categories, the obviously-changing data at least are

more or less monotonic in ea ch bin . As confirmed in Ta ble 2, th e two bins, dividing

th e data exactly in h alf at a point of symmetr y, mak e th e two quadr at ic cat egories

almost ident ical. The quadra tic sta tistics all ar e lower in Table 2 th an in Table 1

becau se of th e sma ll reduction in t he domain of int egration, which elimina ted t he

largest values ofdI dx . The sam ple sta nda rd deviat ions and sta nda rd deviat ions of

the sample means in Table 2 agree reasonably closely with those in Table 1, because

we ha ve essent ially th e sam e functiona l cha nge in each "data " bin in both cases,

ma king slopes all about t he same. However, the quadra tic cur ve-fit stan dar d

deviat ions reflect th e nar rower doma in of integrat ion. If we calculat ed a stan dar d


12/47


deviation of th e gran d mea n in Table 2, using both bins, we would find it 2

sma ller th an in Table 1, becau se of th e doubling of th e dat a (sam ple size).

Table 2. Comparison of data vs quadratic mean s on two

categories (2 degrees of free dom)

x1: Low Half x2: High Half

x Data Qu ad x Data Qu ad

1 0.151 0.7338 0.501 0.0000

2 0.175 0.6677 0.525 0.0053

3 0.201 0.5926 0.551 0.0220

4 0.225 0.5216 0.575 0.0471

5 0.251 0.4445 0.601 0.0844

6 0.275 0.3747 0.625 0.1273

7 0.301 0.3023 0.651 0.18228 0.325 0.2396 0.675 0.2396

9 0.351 0.1777 0.701 0.3077

10 0.375 0.1273 0.725 0.3747

11 0.401 0.0811 0.751 0.4504

12 0.425 0.0471 0.775 0.5216

13 0.451 0.0203 0.801 0.5985

14 0.475 0.0053 0.825 0.6677

mean (n=14) 0.3097 I = 0.2858 mean 0.2592 I = 0.2867

sam ple sd ofI 0.2494 I = 0.2557 sd I 0.2304 I = 0.2556

sa m ple sd of m ea n 0.0667 I

= 0.0683 sd m 0.0616 I

= 0.0683

As before, ima gining th e two cat egories t o be sta tist ically indepen dent , we ma y

look for a rejection of goodness of fit in Figure 2 at the p = .001 level at ( ). .0012 2 138 .

We ma y compu te t he va lue of chi squ ar e from Ta ble 2 as follows:

( ) ( ) ( )( )

( )

( )Data

X E

V

X E

V2 1 1

2

1

2 2

2

2

2

2

2

2

3097 2858

0683

2592 2867

06830 28=

+

=

+

. .

.

. .

.. ; (7)

Again, we get nowhere nea r a value a llowing us to reject goodness of fit; in fact,

we are far th er away, becau se the slight adjustm ent of th e doma in away from th e

largest values ofI (above) ha s r emoved th e worst deviations from a per fect fit.

Two Cat egories with More Da ta

Now, supp ose we complete our experimen ta l work an d ha ve a set of some 1,000

dat a a t equa l 0.001 int ervals on t he doma in ( )x 0 1, . As in Figur e 2 an d (7) above,


13/47


we ha ve decided to keep just ( )x . ,.15 85 . This mean s we will keep just 700 usefuldata .

With out ta bulating da ta , the r esult of a chi squar e test on two cat egories with 350

dat a ea ch is shown in Table 2a:

Table 2a. Comparison of num erous data vs quadratic

means on tw o categories (2 degrees of freedom)

x1: Low Half x2: Hig h Half

Data Qu ad Data Qu ad

mean (n=350) 0.2831 I = 0.2858 mean 0.28520 I = 0.2867

sam ple sd ofI 0.2321 I = 0.2557 sd I 0.23290 I = 0.2556

sample sd of mean 0.01241 I

= 0.01367 sd m 0.01245 I

= 0.01367

Of cour se, the qua dra tic means an d var iances have not chan ged; however, we now

must sta nda rdize th e cat egory means by the much lar ger sample size. The result is,

( ) ( ) ( )( )

( )

( )Data

X E

V

X E

V2 1 1

2

1

2 2

2

2

2

2

2

2

2831 2858

01367

2852 2867

013670 05=

+

=

+

. .

.

. .

.. ; (7a)

an d th is is even fart her from the significan ce level of 13.8. We th us might conclude

from chi squa re on two cat egories t ha t t he fit of th e qua dra tic was good.

Ten Ca tegories

As a bove, we ha ve 700 usa ble, equa lly-spaced dat a on ( )x . ,.15 85 . Let us test the

fit of th e quad ra tic fun ction aga in, th is time dividing our x doma in int o 10

cat egories. Using "box plot" repr esent at ion, th e result , aga in a fit by eye, is in

Figur e 3. Ea ch of th e 10 bins ha s been processed as we did in th e preceding

an alyses. Again, we ha ve such pr ecision of measu remen t of th e dat a in t his

ar tificial exam ple, th at th ere is no legitima te r an domn ess.


14/47


Distance x

Field

Intensity

I

0.0

0.2

0.4

0.6

0.8

0.15 0.45 0.85

Mean sd of I

Mean sd of Imean

Figure 3. Ten-catego ry f i t by eye. Sum marized set of 700 of the same

arti ficial data as in Figs . 1 and 2, for approximately the s ame quadratic

fit, ( )I x= 7 1 22

.

Notice in Figure 3 t ha t t he var iance in ea ch bin clear ly increases with t he local

slope of th e cur ve. Ea ch mea n repr esents 70 data ; th e equa l-spacing of th e data

project on the local slopes to determine the standard deviations shown.

Also notice th at th e large num ber of 70 data in each bin, tr eated a s th ough

sam pled ran domly and independently, yield such a sma ll stan dar d deviat ion of th e

bin mean , tha t for only one of th e dat a, th e second from th e left, does th e quadr at ic

actua lly fall with in a sta nda rd deviation of th e mean .

To prepa re our chi squa re t est of goodness of fit of th e qua dra tic, we sum ma rize in

Table 3 the sta tistics plott ed in Figure 3:


15/47


Table 3. Comparison of data vs quadratic means on ten

catego ries (10 degrees of freed om)

Bin #(70 data)

Binmean

Bin sd sd o f Bin

mean

Quadmean

Quad sd sd of Quad

mean

2

e lement

1 0.63680 0.05840 0.006976 0.695150 0.087730 0.010486 30.97

2 0.43190 0.05945 0.007106 0.421240 0.068241 0.008156 1.71

3 0.24020 0.05077 0.006069 0.215930 0.048726 0.005824 17.37

4 0.09301 0.03404 0.004068 0.079220 0.029227 0.003493 15.58

5 0.01348 0.01220 0.001459 0.011110 0.009933 0.001187 3.99

6 0.01407 0.01253 0.001498 0.011599 0.010206 0.001220 4.10

7 0.09468 0.03432 0.004102 0.080689 0.029524 0.003529 15.728 0.24270 0.05096 0.006091 0.218379 0.049003 0.005857 17.24

9 0.43480 0.05901 0.007113 0.424669 0.068529 0.008191 1.53

10 0.63970 0.05828 0.006966 0.699560 0.088010 0.010519 32.38

Value of ( )Data2 10 141

But, ( ).0012 10 is about 29.5; so, now, with a lar ger sam ple and df= 10, we easily

reject th e nu ll hypoth esis and conclude t ha t t he fit by th e quadr at ic is not good to

explain t he dat a.

Fifty Categories

Before leaving th is intr oduction, we look at our nonr an dom dat a once more to see

wha t would ha ppen if we tried a chi squa re goodness of fit t est a s above, but with

th e dat a subdivided into more nu merous categories, say, 50 of them.

A priori , if we had some ra ndomn ess, we might hope tha t 50 categories would

ma ke for fewer dat a per bin an d th us lar ger varian ces of th e mean in each bin; so,

with higher dfat th e significan ce point , th e cha nce for a good fit m ight be im proved

over what it was with 10.

The r esult is plott ed in F igur e 4 a nd dr am at ically shows th e effect of systema tic

versus ran dom variat ion. With 50 cat egories and 700 data, the real shape of th edat a cur ve becomes evident ; and, quit e cont ra ry to our conclusion ba sed on Figur e 1

or 2 a bove, an d ma ybe to our first impression of Figure 3, we see tha t t he qua dra tic

isn't even close to a good fit to the da ta . Perh aps a slightly bett er qua dra tic might

ha ve been chosen by eye, but th e sha pe of th e dat a cur ve obviously prevent s chi

squa re ever from acceptin g th e qua dra tic on 50 degrees of freedom.


16/47


Data: I = 1 - cos(x-.5) + cos( 3(x-.5) ) - cos( 5(x-.5) )

Distance x

Field

Intensity

I

0.0

0.2

0.4

0.6

0.8

0.15 0.50 0.85

Mean sd ofI

Mean sd ofImean

Figure 4. Fifty-catego ry f i t by eye. Sum marized set of 700 arti ficial

data for the qua dratic f i t , ( )I x=

7 1 2

2

. The funct ion used to generatethe data throughout the Introduct ion i s g iven in the graph t i t le .

Tabulation of the Figure 4 statistics has been omitted; however, the significance

point is ( ).0012 50 87 ; the obtained value is comput ed at Data

2 304= ; so, again , th e chi

squar e test r eassur es us th at we should reject t he fit as n ot good.


17/47


II. Me an in g of Chi-Squ are

At t his point, we leave th e intr oductory examples for some form alism a bout th e

sta tist ics. Gener alizat ion of th ese forma lisms to th e mu ltidimen siona l case [4, 5] isnontrivial but rea sona bly str aightforwar d.

For m a li sm s

Relation to Gam ma Distribut ion

The factorial of an int eger n , defined a s ( ) ( )Fact ! ...n n n n nii

n

= = =

1 11

, appears in

combinat orial an alysis everywhere. The factorial ma y be expressed as a special

case of the cont inu ous gamm a fun ction . The gamma fun ction is discussed in th e

cont ext of Bayesian inferen ce in [6, Sect. 7.3 ff.]; the der ivat ions below follow th oseof Feller [7 vol II, Ch. 2]. It isn't mu ch of an overst at ement t o say tha t t he gamm a

fun ction is as funda ment al t o stat istical inference as th e exponent ial fun ction is to

th e solut ions of differen tia l equat ions.

The gamm a function is defined as,

gamma(x) ( ) =

x d ex 1

0

. (8)

Integrating (8) formally by parts,

( ) ( ) xx

d ex

xx= = +

1 1

10

; th erefore,

( ) ( ) x x x+ =1 ; an d so, (9)

( ) ( ) ( ) ( ) x x x x= 1 2 2 , (10)

which clearly shows t he r ecur sive relat ion ma king ( ) ( ) x x= 1 !, for th e special case

ofx an integer.

Using th e gamma fun ction in (8), we ma y define t he gam m a probability density

of random variable Xon ( )0, by,

( ) ( )( )

P X x x e x= = = gamma ,

1

1

. (11)


18/47


Looking at (11), we notice that the familiar exponential density then may be

defined a s ( )gamma , 1 ; and s o, by ana logy, in (11) we will ha ve th e mea n ( )E X =

and t he variance ( )V X = 2 .

In a sma ll digression, recall th e combin at oria l express ion for t he n um ber of ways(un order ed) of ta king a sam ple ofj objects from a discret e populat ion ofm of th em:

( )( )( )

( )( ) ( )

( ) ( )m

j

m

j m j

j m j

j m j

j n

j n

j n

j n

=

=

+

=

+=

+ +

+ +

!

! !

!

! !

!

! !

1

1 1. (12)

The discrete case ma y be genera lized to the beta integral, in t erms of th e gam ma

fun ction, sta rt ing this way:

( )( )

( ) ( )

B ,

=+

1

. (13)

Oth er u ses of th e gamm a fun ction ar e discuss ed in [4, Ch. 4] an d [7 vol II, Ch. 2].

Retur ning to th e subject, we may r edefine th e un it norm al density as in (1) above,

in th e following way: Fr om (8) an d (11), settin g to 1 2 and to 1 2 yields,

( )gamma x ed e

xe

xx1

2

1

2

1

1

2

1 21 1

2

112

1

2

1

21

2

0

2,

=

=

. (14)

The definite integra l ( )1 2 evaluates t o ; so,

( )

gammax

e

x1

2

1

2

1 1

2 12

1

2

0

1

2

2

,

=

. (15)

In th e brackets in (15), we have x in a unit norma l distribution. Becau se

( )1

0 1x

N X; , is the density of th e squar e of th e unit norma l ran dom var iableX [7

vol II, p.47], we ma y say t ha t

( )[ ]gamma N 1

2

1

20 1

2

, ,

. (16)

So, (16) mea ns t ha t ( )gamma ,1 2 1 2 represent s th e probability density of th e squar eof a ra ndom variable Xdistributed a s ( )N 0 1, . It is easy to see th e genera lizat ion of


19/47


(16) to a sum ofn variat es, each distr ibuted as ( )N 0 1, : Therefore, reca lling (4)above, we ma y write

( )gamman

n1

2 2

2,

; an d, (17)

th is is just th e definition of th e distr ibut ion of chi squar e.

Thus, we have used the gamma density to show th at a sum ofn squared unit

normal varia tes will be distributed a s ( )2 n . By an alogy to (11) above, th e mean of

( )2 n will be n , and th e varian ce will be 2n .

Moment Generat ing Fun ction

Not ma ny rea ders would see th e relat ion of (15) to (16) above. An ea sier way to

show the r elation bet ween th e normal a nd chi squar e probability densities is bycompar ison of th eir moment genera tin g fun ctions (mgfs).

Th e m om ent generating fun ction of a r an dom varia ble is a s eries expan sion of its

densit y which, t erm by term , yields as coefficient s th e n -th moments of th e ra ndom

variable about t he origin. The first moment is the mean , the second t he variance,

an d so fort h. What is import an t for us here, is tha t two ra ndom var iables, un der

very genera l conditions, ha ve the sa me density if and only if all th eir moments ar e

identical [6, 6.7.8; cf. 6.3.3]. And, if th ey both h ave the sam e mgf, th eir moment s all

will be identical.

The mgf of a r an dom variable X is defined [6, Sect . 7.1] for cont inu ous

distributions on ( ) , as

( ) ( ) ( )mgf ; f t X E e dx e X xtX tx= = =

, (18)

in wh ich ( )f X x= is th e probability density function ofX. The momen ts will be

obta ined by a power series expan sion on t; the int erval of series convergen ce does

not concern us here.

The rest is very simple: Fr om (14),

( )( )f X x

xe

x

2 1

21

2

1

2

1

2

1=

=

gamma , ; so, from (18),

( )( ) ( )mgf ;t dx e xx t

2 1 21

211

2=

. (19)


20/47


Cha nging var iables in (19) with y x= 1 2 an d fixing t he lower limit of int egrat ion

immedia tely yields, for chi squa re,

( )

( )

( )

mgf ;t dy et

y

2

1 2

2

0

11

2

2

=

. (20)

On th e oth er ha nd, the un it normal density (1) ma y be writt en,

( ) ( )f ,N

x

x e0 1

21

2

2

X = =

. (21)

So, th e mgf for X2 will be,

( )( )( )

mgf ; ,t N dx e dx etx xt

x2 2

0

1 2

2

0

0 11

2

1

2

2 22

= =

, (22)

which clear ly is th e sam e as for t he chi squa re in (20).

While on m gfs, it sh ould be mentioned tha t t he mgf also is perhaps t he easiest

way to derive the var ian ce of th e Poisson dist ribu tion [7 vol I, Sect.6.5 - 6.7] . The

Poisson distr ibut ion, wh ich gives t he (discret e) probability ofk events each with

small probability , has probability function [6, 6.3.7],

( )P k ek

k

X = =

;!

, (23)

an d may be shown to have mean = var iance = . For n repeat ed Poisson t rials, the

mean will be ( )nE X n= , and th e varian ce will be ( )nV X n= . For n repeated

tr ials, the sta nda rd deviation of th e Poisson m ean will be ( )nV X n = n n

= ; so, by the centr al limit t heorem, we would expect a Poisson dist ribu tion sum

ofn repeated trials with mean to approach a norma l distr ibution with m ean n

an d stan dar d deviation of th e mean n (= stan dar d error).

Sufficiency

A sample of a ra ndom variable may be seen a s a measu re of th e inform at ionknown about t he un derlying physical process. The sample size and th e var iance of

th e measu remen t determine the inform at ion in th e sam ple: Var iance adds unu sable

inform at ion (possible inter pret at ions of th e physical pr ocess) and so redu ces th e

precision of th e sam ple in represen tin g th e physics. Sam ple size decrea ses the

var ian ce of th e mean of th e sam ple and so increases t he pr ecision of th e sam ple in

repr esent ing the mean of th e physics. Becau se physical processes are of little value

un less repeat able, an d becau se the mea n ma y be used as an expecta tion of fut ur e


21/47


repetitions, size of the sample and variance of the measure work inversely in

determining th e inform at ion in a r an dom sam ple.

Given a variety of sta tistics describing a ra ndom sam ple, a sufficientstatistic [6,

Sect. 10.4] is one wh ich describes t he da ta with no loss of inform at ion, so th at

inform at ion beyond th at of the su fficient sta tist ic does n ot im prove prediction of

fut ur e samples. Given th e same ph ysical process, a set of measu rement s of sma ll

var ian ce genera lly will be sufficientto predict a set with greater var iance: The

latt er set m ay be viewed as th e form er set plus some extr a r an dom variat ion.

Gener ally, sufficiency requires choice of a sta tist ic good en ough to repr esent th e

un derlying ph ysical p rocess; oth erwise, err or in choice of sta tist ic will add

system at ic var ian ce which will redu ce predictability. An example of th is was in the

int roductory example a bove, in which t he wr ong choice of cur ve (qua dra tic) add ed

systematic error--clearly systematic, because there was no random error.

In t he present cont ext, we may see chi squa re a s a wa y to represent th e goodness

of choice of cur ve, such goodness being an un avoidable pr erequ isite t o exam ina tion

of th e ran dom err or in th e data . This goodness must be relat ive to th e cat egories of

th e data , not t o compet ing cur ves. But , in an y case, we can not look at s ufficiency

without goodness of fit.

So, how can we know tha t chi squar e might be u seful generally as a test , when,

perha ps, the chi-squar e assu mpt ion of a su m of squar es of unit n orm al deviat es

might itself be bad?

Centr al Limit Theorem

The goodness of chi squ ar e as a test st at istic for goodness of fit of rea sonably

large samples generally need not be questioned. The reason is th e central limit

theorem [7 vol II, Sect. 8.4 & Ch . 14], which h olds u nder alm ost all condit ions in

which st able appar at us produces consistent dat a:

T he sum X of any set of ind epend ent ran dom variates, regardless

of how distributed, will approach a n orm al distribution

( )N X X ,2 as the sam ple size approaches infinity.

Of cour se, that being tr ue, the sa mple mean will represent ( )E X X and the

sam ple varian ce will represent ( )V X X 2 .

As app lied to th e chi squa re quest ion, if th e dat a in ea ch cat egory fulfil therequiremen ts of th e centr al limit t heorem, the mean in th at cat egory always may be

sta nda rdized for u se as a valid element of a chi squar e sum.

Fu rt herm ore, if errors of measu rement cau sed by instr um enta tion ar e themselves

subject t o th e centr al limit t heorem, an d th erefore a re n orma lly distributed, it can

be proven [7 vol II, Sect. 2.2] th at th e convolut ion of th e instr um ent at ion er ror with


22/47


th e physical process also will be subject t o th e cent ra l limit t heorem, as sum ing a

reasonably consistent , repeat ably measu ra ble process.

Categor ies in Ra nd om-Sa mp le Chi Squa re

Nu mber of Cat egories

Above, it was st at ed th at a su fficient st at istic will have less variance tha n one

which is not, for a good fit an d the sam e data . Fr om (17) an d comm ent s

imm ediat ely following, as t he dfof chi squ ar e increases, its va ria nce increa ses, too.

However, t o increa se df, we cha nge th e experiment by cha nging the way th e

cat egories for avera ging the dat a ar e defined. Chi squa re represents th e cat egories,

not the data .

In t erms of the dat a, a set of mean s on ma ny cat egories (in th e limit, one cat egory

per dat um ) always may be combined to ma ke fewer categories. In th is sense, a chi

squar e of great er dfalways will be sufficient relative to one of lesser df, for a given

set of dat a. Looked at in ter ms of individua l cat egories, a wider cat egory mean s

great er likelihood of a cha nge in t he ph ysics, ma king for a bigger var ian ce, given th e

sam e set of dat a being cat egorized. So, in chi squa re test ing, more num erous,

na rr ower cat egories (alwa ys fulfilling n orm ality) will be sufficient rela tive t o fewer,

wider ones. The more th e df, th e closer to sufficiency.

Number of Par ameters

For a given da ta set, it seems obvious t ha t a dding free th eoretical para meter s will

ma ke th e fit of th e theory bett er. In t he limit, one always may fit N dat a cat egories

by an ( N 1)-th degree polynomia l, result ing in zero var ian ce in each cat egory a ndth erefore a perfectly good chi squa re fit. However, this would be very unint erest ing

physics--to claim, in t he limit, th at everyth ing was a certa in n um ber of term s of its

own, u nique polynomial.

Clearly, a physical goodness of fit somehow should involve the complexity of the

hypothesis, as well as t he cat egorizat ion of th e dat a. The issue her e is, how should

we view th e degrees of freedom in a goodness of fit t est wh en we a re free t o var y

both t he dat a categories and th e number of th eoretical par am eters? Is every theory

with n um erous par am eters an d wide-sweeping flexibility bett er t ha n one which

leaves some variance unaccounted?

Mea n ing of d f

Bock [5, Sect. 2.6.3] defines degrees of freedom in ter ms of the d imen sion of a

subspace on which th e dat a ar e projected.

Brownlee [3, Sect. 8.1] defines d egrees of freedom a s "th e n um ber of var iables

minu s th e num ber of independent linear r elations or const ra ints between th em [an d

equa l to th e degrees of freedom of th eir su m of squa res]."


23/47


In describing th e chi squa re a pproximat ion t o the mu ltinomial distr ibution,

Brownlee coun ts th e degrees of freedom a s t he n um ber of degrees of freedom for t he

chi squar e minus t he nu mber of par am eters estima ted in order t o perform a cur ve

fit. In an exam ple [3, Sect. 5.2], estimat ion in the sam ple ma kes the dfof th e chi

squar e equal to th e number of cat egories minus 1. He assum es a Poissondistribution for which he su btr acts a nother 1 from df, because the Poisson density

ha s one free par am eter. Thus, Brownlee's final dfis equa l to the nu mber of

categories minus 2.

If Brownlee's appr oach wer e a pplied to our problem of coun tin g df, we would u se

th e form ula,

( )df N N C P= 1 , (24)

in wh ich NC is nu mber of cat egories an d NP is nu mber of free par am eters. At least

one popula r st at istical softwa re pa cka ge uses form ula (24) for chi squ ar e goodness offit.

For exam ple, consider th e ar tificial dat a in F igure 4b, an d th e effect of sequen tia lly

fitt ing th em with a line of slope b an d then with a const an t function a , each t ime

subtr acting away the fit from the dat a:


24/47


(a) Data( x )

x

0.00

0.25

0.50

0.75

1.00

0 2 4 6 8 10 12 14

(b) Data(x ) and fit of slope b = 0.058

x

0.00

0.25

0.50

0.75

1.00

0 2 4 6 8 10 12 14

f1

(x) = 0.058 x

(c) Data( x ) - f1(x) and fit of a = 0.112

x

0.00

0.25

0.50

0.75

1.00

0 2 4 6 8 10 12 14

f2

(x ) = 0.112

(d) Data(x) - f1

(x) - f2

(x )

x

-0.25

0.00

0.25

0.50

0.75

0 2 4 6 8 10 12 14

Figure 4b. Effect of l inear parame ters on residuals . The data we re

fit by f(x) = a +bx = 0.112 + 0.058x before the success ive su btract ions .

Using these two para meters a and b in a cur ve fit in some sens e does redu ce th eort hogona lity of th e remainder (residuals). The smaller vertical ra nge means t he

varian ce was reduced, too. With 12 data an d 2 par am eters t o fit, it would seem

reasonable in th is exam ple to use the r ule above to test goodness u sing ( )p2 9 , with

dfcomputed as 9 = (12 - 1) - 2. For exam ple, in Figur e 4b, if theory forced a = 0.10,

th en only b would be fit, a llowing a test on ( )p2 10 , and possibly ma king a ccepta nce

of th e fit more likely for th e same dat a. Clearly, in such an exam ple, th e accepta nce

would be m ore likely for a th eoret ical valu e ofa equal t o its value best fit to th e

dat a. So, a "str onger" th eory which predicted par am eters a ccur at ely without dat a

would get preferen ce in goodness. The Galahad Effect, one m ight sa y, in wh ich a

cur ve gains st rength becau se its th eory is pure.

The problem with th e appr oach of simply subtra cting t he n um ber of free

par am eters is th at it tr eats everything as linear a nd ort hogona l. It gives no more

import an ce to a pa ra meter of the curve tha n t o just one of th e possibly num erous

cat egories. This appr oach seems unqu estionably valid when the cur ve to be fit is

linear. In other cases, although the cat egories might be linearly independent a nd


25/47


cont ribute equa lly to the r an dom var iance removed by th e fit, const ra ints on t he

par am eters in genera l will int roduce nonlinear const ra ints on th e cat egory mean s.

Anoth er, en tir ely differen t problem, is th e efficiency of the estim at or us ed for a

par am eter to be fit. Sur ely, we should subtr act a dfwhen t esting a polynomial

hypoth esis, if we ar e fitting an intercept para meter u sing th e data mea n. But, what

if we ar e using t he less efficient, nonpar am etric stat istic, th e median, inst ead of th e

mean ? The median only accoun ts for th e ran k order of th e data a nd is invar iant

un der an y strictly monotonic tr an sform of th e data . Should we subtr act 1 2 dfif we

use th e median of th e data to estima te th e hypoth etical intercept?

For a t heorist interest ed in an objective test of a h ypoth esis with a small nu mber

rof free pa ra met ers, t estin g by (24) above on a lar ge nu mber of cat egories, sa y 10r

or more, essentially ignores th e nu mber of par am eters. Yet, under the null

hypoth esis that th e data ar e well fit by the cur ve, th e data in some sense should be

expected t o be interdependent in a way similar to the wa y they respond to small

chan ges in th e para meters.

Each parameter is critical, to an h ypoth esis with two par am eters to fit. With 100

cat egories of dat a, th ough , dropping ten categories might n ot ma ke mu ch differen ce

in t he conclusion.

Need for a Cor r ect ion t o d f

The pr acticality of the pr oblem ma y be seen by compar ing two hypoth etical

cur ves as shown h ere, one a sine

cur ve with 3 free par am eters

(vertical displacement, phase, andfrequency) and t he other a 2-term

polynomial (cons ta nt + coefficient

of square).

Sup pose we were a llowed to

adjust the param eters shown. If

th e data ra nged only over a 1,

th e sine fit might be bett er th an

th at of the polynomial; with th e

dat a a s shown, a good sine fit

might be hopeless. By th e ru leabove, we would u se a more

str ingent test (more likely to reject goodness) for t he sin e cur ve tha n for t he

polynomial.

We note t ha t a dding a free para meter to the sine (say, amplitude) or a couple of

more ter ms to the polynomia l, might m ak e the fit good in both cases. Alas! In th e

cont ext of a th eory, man y par am eters will be const ra ined by the physics a nd

x

-1.00

-0.50

0.00

0.50

1.00

1.50

2.00

2.50

0.0 0.3 0.5 0.8 1.0

Data(x)

a + cx2

a + sin( bx + c )

Figure 4a. Two il l-f it curve s.


26/47


un available for ar bitra ry fitting. The problem of th e present pa per is th at we

can not know in advan ce what th e relation will be, for t hose tha t r emain free to fit.

Clear ly, ignoring th e nu mber of par am eters in a goodness of fit t est will make t he

test value of chi squa re such t ha t df df test categor ies= , so th at , for t he sa me curve fit an d

dat a, a ccepta nce of the n ull hypoth esis (th at th e fit is good) will be favored.

However, as above, merely subtra cting th e nu mber of par am eters from ( Ncategories 1)

would seem likely t o under corr ect dfexcept in th e special case of linear regress ion.

We seek a wa y of corr ectin g th e dfin a wa y equally fair , in some sen se, to all

curves to be fit.


27/47


III. The P aram ete r Correc tion

Defin i t ion of Cor r ect ion

Development and Rationale

Before st ar ting, we emphasize tha t we do not addr ess th e question of

cat egorization of th e data : Pr esuma bly, an optimum categorization a lready has

been achieved, or perh aps th e problem may not be flexible in these ter ms. We

accept t he categories of the da ta set a s a given.

We also accept a s fixed a priori, the level of significance p to be requ ired for a

rejection of goodness of fit.

A clue to th e solution su ggested her e ma y be seen in th e example in Figure 4babove: In th at example, th e data residua ls were just as ran dom as th e origina l dat a

un less fit by a stra ight line. If th e residuals were fit by a stra ight line, the slope

would be 0 and t he int ercept would be 0, becau se th ese ort hogona l componen ts wer e

th e ones subt ra cted down to 0. But, th ey were the only ones subt ra cted away, and

no oth er stru ctu re in the dat a was affected. If we wished to test a linear fit to the

residuals by chi squa re, we would subt ra ct a way df= 2; however, we assu me h ere

th at to test a tr igonometric fit, we would subt ra ct a way nothing from th e df. Even

th ough t he varian ce of th e residuals in Figure 4b was less tha n t ha t of th e origina l

dat a, th e residuals rema in just a s dimensionful as th e data , except as projected on

th e intercept or t he slope of a linear basis set.

Given cat egories C, NC in n um ber, we consider a set of ar bitrar y cur ves K to fit,

each cur ve itself with a n a rbitra ry an d probably not u nique set of par am eters NP in

nu mber . All such cur ves ar e ass um ed single-valued on th e doma in of th e cat egories

of dat a. The quest ion is how to coun t th e dfto be used in a chi squa re test. This

quest ion is equivalent to th e one of coun tin g how ma ny dfshould be subtracted from

th e number of dat a categories. We know th at for rea listic problems we mu st ha ve

N df NP C< < .

The a nswer we propose her e is to accoun t for t he diversit y of possible par am eter s

by ignorin g th eir num ber an d consider ing only their effect. We th erefore propose to

countth e degrees of freedom in th e dat a by measu ring th e ort hogona lity of th e dat aafter t he curve has been fit.

Here is how we prepare th is quant ificat ion: For data on ra ndom var iableXwith

domain { }x , described categorized by some function, { }Data x , we can compu te t he

total varian ce in the da ta cat egories, ( )V Datacat . We choose th e curve K to be fit, fit

it, subtra ct it, and th en exam ine the residual var iance, ( )V Data K res , .


28/47


Under th e null hypoth esis tha t t he fit was good, the residua ls should represent

solely ra ndom var iat ion; th ey of cour se ma y not be as sum ed constr ain ed by K, K

ha ving been removed. Under th e nu ll hypoth esis, we assu me here th at th e

ort hogona lity of th ese residu als m ay be seen as being equa l to th e degrees of

freedom of th e cur ve fit t o th e origina l dat a, except wh en consider ed in r egard t o afit by K itself.

Coun tin g Pr ocess

So, having r emoved t he effect of th e curve K, we propose to count degrees of

freedom by r emoving furth er var iance by fitting t he r esiduals with th e sum of a set

( ){ }L L x of linear ly ort hogona l fun ctions , leaving us with a n ew residua l var ian ce,

( )V Data K Lres , , . Sta tist ical significan ce of th e sum will be test ed after a dding each

term ofL to it; any term removing statistically significant variance will be omitted

from L .

The size of th e set L th en will define th e dfof the chi square t o be used in t esting

goodness of fit. We see no reas on n ot t o use a n ort hogona l set of polynomial

functions; because we will enumerate terms in L beginning with L0 , we call t he s ize,

n df+ =1 , the nomial of th e fit:

( ) ( ) ( ) ( )df Data K L n L n R Data K L L xcorrected jj

n

= + ==

nomial , , , , ,1 00

s. t. es , (25)

in which t he j ta ke on values of th e degree ofL not foun d sta tist ically significan t.

To see th e logic of this, if only a few terms in L were required to reduce theresidual varian ce ( )V Data K Lres , , below some th resh old , then th e residuals must

ha ve had low-order str uctur e which was m issed by K; so, th e dat a cat egories were

not orth ogona l in view of th e th eory an d indeed sh ould be test ed at lower df. On

th e oth er ha nd, if nu merous term s were required in L to reduce the r esidua ls below

, the r esiduals must ha ve had enough ort hogona lity to be tested at high df.

St opping Cr iterion

The one rema ining pr oblem is the st opping crit erion: How good a fit should we

requ ire of th e ort hogona l fun ctions ( ){ }L x , none of which being significant?

We find an an swer by again referring to the n ull hypoth esis tha t t he fit is good:

Under th is hypoth esis, we assume t ha t t he initial cur ve-fit by Kmu st ha ve removed

a st at istically significan t a moun t of var ian ce; if so, ( )V Datacat > ( )V Data K res , ; in fact,we may write this in para metric terms a s,

( ) ( )V Data F V Data K cat p res , , (26)


29/47


in wh ich Fp is variance rat io at th e one-tailed p-level for t he F isher an alysis of

variance F-test. So, to calculat e where to term inat e th e nomial, we suggest a pplying

th is criterion:

( ) ( ){ } ( ) ( )nomial , , , , , ,maxData K L Min n L V Data K L F V Data K Lres p res n 1 0s. t. . (27)

This just sets a t hr eshold which pr events th e creat ion of nonexistent dfby the

ort hogona l fun ction fitt ing process: If th e n -th polynomial leaves th e n -th residuals

with significan tly less variance tha n t he original 0-th residuals, th en a ll degrees of

freedom (beyond th e st ru ctu re evident ly overlooked by K in the dat a an d now foun d

in th e original r esidua ls) ha ve been discovered.

A similar rationale is used sometimes in factor analysis (as distinguished from

principle components an alysis). It should be ment ioned tha t th e dffor t he

significance level of F is not a difficult issue h ere, becau se we ar e looking a t

residuals a fter our problematical cur ve Kha s been fit and gone. The df for

computing ( )V Data K Lres n, , from a sum of squa res of deviations will be just t he dfofthe data , ( NC 1), minu s the polynomial const ra ints ( n + 1 ). This is becau se, in both

cases, the r esidual variance will have the dimensiona lity of th e dat a u nder t he nu ll

hypoth esis tha t t he fit ofKwas good an d th erefore t ha t L will be in effective in

explaining the data.

So, we ha ve descr ibed a pr ocess to redu ce an ar b i t ra ry problem in

coun t in g p h ysica l p a r a m et er s a s d f, t o one of cou n t in g l inea r ly in d ep end en t

p olynom ia l p a r a m et er s a s d f.

The Corr ection Pr ocedur e

In the following, ( )Vres refers to a variance between categories and is computed

from a sum of squar es by dividing by a dfdependin g on t he nu mber of cat egories

an d their linear in dependence. The sample var iance in the origina l dat a, as

categorized, is ( )V Datacat an d is a varian ce am ong t he category m eans; in cont ext of

an alysis of varian ce, this is called th e "between tr eatm ents" varian ce.

We resta te t he pr oblem: To perform a chi squa re goodness of fit test in th e

cont ext of a ph ysics experiment, a ssum ing predeterm ined dat a cat egories Cand

significance levelp , the degrees of freedom dfma y have to be adjusted for t he free

par am eters in the cur ve to be fit. The proposed an swer is:

1. Fit th e cur ve K to th e cat egorized dat a. Record th e sum of squa res of th e

residu als (differen ces or r ema inder s) after t he fit, stan dar dizing each differen ce by

th e var ian ce in its cat egory. The resu lt will be th e valu e of chi squa re for th e fit:


30/47


( )Data j

j

Nj j

jj

Nj j

jj

N

SSX K

V

X K

sd

C C C

2

1

2

1

2

1

= =

=

= = =

. (28)

The stan dar d deviation sd here is the st an dar d deviation of th e j-th cat egory mea n,also called the st an dar d err or, which sh ould be estimat ed from th e origina l data

before th e fit by K. Note th at t he cat egory variance corr esponding to chi squar e,

which will not be u sed in t he goodness of fit t est, is obtain ed by dividing chi squa re

by th e df: ( )V Data K df res Data, = 2 , in which we a pproximat e th e dfconventionally by

N NC P 1 .

IfK is a polynomial, count the number NP of free pa ra meters in t he polynomial

an d comput e df= ( NC 1) NP : This immediat ely is th e dfsought. IfK includes

polynomial free parameters (a constant offset, for example), subtract the number of

th em before per form ing th e goodness of fit t est.Also, esta blish a st opping criterion by looking up or calculat ing th e Fish er

varian ce ra tio value Ffor one-ta iled rejection of a n ull hypoth esis tes t ofdfof

( )V Datacat vs. dfof ( )V Data K res , at significan ce levelp . This $Fp ma y be foun d in

sta tist ics references in th e cont ext of an alysis of var ian ce. The significan ce test will

not a ctu ally be perform ed; however it s dfwill set $Fp for th e stopping crit erion. One

dfwill be th at of th e data , NC 1; th e oth er will be th at of th e data with const ra ints

of th e curve fit Kuncorrected, N NC P 1 . So, find,

( ) ( )( )F df df F p V Data V Data K p,

$

, . (29)

Note th at in general Kwill ha ve removed consider able var ian ce from th e dat a, so

an immediate Frat io test with,

( )

( )F

V Data

V Data K

cat

res

=,

, (30)

easily might exceed $Fp . This will be irr elevant , becau se (a) we ar e not merely

testing whether Ksays someth ing about t he dat a; and, (b) we can not assum e tha t

th e origina l var ian ce over th e whole set of cat egories was const an t.

2. One sum a t a t ime, progressively fit th e residual data with th e sum of th e

ter ms from an ort hogona l polynomia l set ( ){ }L x an d comput e the new residual

varian ce from tha t fit, ( )V Data K Lres , , . When th e nomial equals NC 1, st op. Or,stop when t he st opping crit erion is invoked: This will be when t he ra tio of var ian ces

am ong elements ofL no longer is below $Fp , na mely, when


31/47


( )( )

FV Data K L

V Data K LF

res

res n

p= , ,

, ,$

0. (31)

For everyL yielding a statistically significant fit, there was a linear dependence, sosubt ra ct 1 from t he n omia l for ea ch su ch fit.

To compu te t he first an d all subsequent values of ( )V Data K Lres , , , compu te t he onlyvarian ce present, na mely that between t he cat egories:

( )V Data K Lres n, , =

1

1dfresj

j

NC

S S , (32)

with ( )df N nres C = +1 1 , for Ln the highest-degree term in a polynomial ofn -th

degree.

3. Fina lly, to test t he goodness of th e fit, perform a st an dar d chi squar e test with

dfequal to th e nomial. In cases not requiring a corr ection, this dfwill equa l NC 1 .

Step 2 m ay be perform ed expeditiously by doing a pa ra met ric multiple regress ion

of the residuals ( )V Data K res , on a convenien t set of ort hogona l polynomials. AChebyshev set of coefficient s, or a ny of severa l other s, ma y be foun d in [8] and used

with P C softwar e.

Ap p l ica t ion in S yn th et ic Exa m p les

The value of using synthet ic examples is th at th e un derlying, corr ect fun ctiona lform of th e data will be known. Thu s, we can pr edict how a good chi squa re

goodness of fit t est should beh ave.

To insist on st at ing th e obvious, in th e following examples, compa red with th e

int roductory ones above, we ha ve a qua litat ively differen t set of dat a. The next

dat a below ar e simulat ed experiment al data , with real varian ce. Uniformly

distributed ra ndom numbers were used to add th e var iance.

In th e following exam ples, we sha ll assum e th e centr al limit t heorem so as t o be

able to tr eat each bin mean as norma lly distribut ed. In th e first example below, we

sha ll be comput ing th e sta nda rd deviation in each bin u sing 14 -1 degrees of

freedom; one degree is lost becau se th e bin mea n was estima ted from t he sa me 14

dat a. Knowing each bin mean and sta nda rd deviation, we sha ll comput e th e

sta nda rd deviat ion of each data bin mean (stan dar d error of th e mean) and u se it,

not t he corr esponding th eoretical fit pa ra meter, t o stan dar dize each element of th e

chi squar e sum.


32/47


Fifty Cat egories in Ran dom Ch i Squar e

Quadr at ic Fit

Recall from F igure 4 th at th e qua dra tic fit with 50 cat egories visibly was notgood. Let's look at t he same dat a cat egories, but with th e data r an domized by

add ition of some va ria nce.

Distance x

Field

IntensityI

-0.4

-0.2

0.0

0.2

0.4

0.6

0.8

1.0

1.2

0.15 0.50 0.85

Mean sd of I

Mean sd of Imean

Bad Line:

I = 7(x -1/2)

Cubic Fit:

I = 25| x -1/2|3

Quadrat ic Fit:

I = 7(x -1/2)2

Quartic Fit:

I = 70(x -1/2)4

Figu re 5. Fifty-catego ry fits by eye. The 700 artificial data of Fig. 4

here are added som e random error and again f it wi th the same

categorized quadratic, ( ) ( )I a x b x= = 22

7 1 2 . A purpose ly bad l inear

fi t also is show n, as is the old cubic f i t by eye from Figure 1 above an d a

new quart ic f i t .

Computing Data2 as in (5) or (28) above yields a valu e of about 35.8 for t he s olid

line in F igur e 5, which r epresents t he qua dra tic cur ve. We know from published

ta bles or calculat ion th at ( ).0012 13 = 34.5 an d ( ).001

2 14 = 36.1. We find in th e tables,

also, th at our st opping crit erion will be ( )F. , .001 49 47 250= . The quadr at ic ha s just two


33/47


polynomial par am eter s, so our corr ection should allow us imm ediat ely to assign a

corrected dfof ( NC 1) - 2 = 47; however, we proceed with t he complete correction

an alysis, to show how it work s.

If we compu ted our correction t o df= 14 or a bove, and if no ter m wa s significan t,

an d no residual reached the L Ln0 var ian ce ra tio stopping crit erion of 2.50, we

would have included sufficient dfto know we could a ccept t he n ull hypothesis t ha t

th e fit was good.

The resu lts of th ese compu ta tions ar e in Table 5. To redu ce possible err ors on

th is first real t rial of th e corr ection, th e ta ble was const ru cted by repeat edly fitting

nonorth ogona l polynomials with t erm s of th e form, a xnn , as described for t he

correction above. The resu lt was verified by repea tedly run nin g mu ltiple regression

on a n in crea sing nu mber of ter ms of th e Chebyshev ort hogona l polynomia ls of th e

first k ind. In pr actice, of cour se, only the regr ession would be necessar y.

The low, relatively unchanging $F ra tios for t he lowest-order nomials in Ta ble 5ar e a good indicat ion t ha t we should test a t h igh df.


34/47


Table 5. Correcte d d fcalculations for the 50-category qu adratic in

Fig. 5 . All but the rightm ost two columns we re done by curve-fi t with

nonorthogonal a xnn polyn omia ls, term by term . The "MR"column s

we re done by mult iple regress ion on Chebyshev polynomials .

Ra w

Nom-

ia l

DescriptionPoly

Residual

SS

Poly

Residual

Variance

$F

MR

Residual

SS

MR

Residual

Variance

- Data2

= 35.82 - - - -

0 ( )V Data K Lres , , 0 0.4725 0.00964 - 0.4725 0.00964

1 ( )V Data K Lres , , 1 0.4695 0.00978 0.985 0.4695 0.00978

2 ( )V Data K Lres , , 2 0.4130 0.00879 1.097 0.4128 0.00878

3 ( )V Data K Lres , , 3 0.4127 0.00897 1.075 0.4127 0.00897

4 ( )V Data K Lres , , 4 0.4082 0.00907 1.063 0.4079 0.00906

8 ( )V Data K Lres , , 8 0.3643 0.00888 1.085 0.3613 0.00881

20 ( )V Data K Lres , , 20 0.3597 0.01240 0.778 (n ot don e) (n ot don e)

None of th e fits wa s significan t, a nd a ll compu ted va lues of $F sta yed well below

2.50, so we ar e justified in t estin g at ( )p2 20 or perh aps higher df. We kn ow

alrea dy that we can not reject t he nu ll hypoth esis at df = 14, so an y higher dfsurely

will yield a sta tist ic sufficient t o th e sam e conclusion . Fu rt her a na lysis is not

requ ired, an d accordin g to our corr ected chi squa re t est, we accept t he fit of th e

quadratic as good.

It sh ould be ment ioned her e th at a least -squar es or other similar fit of a 20-term

polynomial techn ically is not tr ivial, even on a fast compu ter . It would be ra th er

amazing that such a polynomial really could be fit optimally to a set of data small

enough to appr oach the num ber of term s being fit. But, we assu me the best and

look t o th e fut ur e for better .

Linear Fit

Table 5a sh ows th e sam e test as Ta ble 5, but for t he obviously bad fit of a st ra ight

line passing on a st eep slope thr ough t he middle of th e data (Figure 5). Again, the

two polynomial parameters should allow immediate correction to df= 47; we

cont inue an yway, for demonst ra tion pur poses.


35/47


Table 5a. Computation s as in Table 5 for corrected ch i square

goodne ss of fi t test of the bad ( ) ( )I a x b x= = 7 1 2 l ine in Fig. 5. Thestop on the third row makes the n omial = 1; how ever , the remainder i s

supplied to show the result of a bad f i t . Terms s ignificant in regression

are shown in parenthe ses .

Ra w

Nomial

Descr ipt ion Residua l

SS

Residual Variance$F

- Data2

= 106.85 - -

0 ( )V Data K Lres , , 0 101.58 2.073 -

1 ( )V Data K Lres , , 1 2.8708 (*L 0,L 1) 0.05981 34.7

2 ( )V Data K Lres , , 2 0.4128 (*L0-L 2) 0.00878 236.1

3 ( )V Data K Lres , , 3 0.4127 (*L 0,L 1) 0.00897 231.1

4 ( )V Data K Lres , , 4 0.4079 0.009064 228.7

8 ( )V Data K Lres , , 8 0.3616 0.008819 235.1

12 ( )V Data K Lres , , 12 0.3607 0.009749 212.1

As before, ( )F.

, .001

49 47 250= . But, as obvious in Table 5a, the bad linear fit has

left t oo mu ch st ru ctu re un accoun ted an d push es $F immediately out to 34.7, mak ing

th e nomial 1; this would imply testing a gainst ( ). .0012 1 108= . In an y case, becau se

( ).0012 49 is only about 85, th e Data value of 106.85 quickly allows us to reject the fit

of the bad line as not good no mat ter what th e df.

Cubic Fit

Once more, we look at a differen t curve fit to the da ta in Figur e 5: This tim e, it is

th e cubic from Figur e 1, replotted in F igure 5 an d origina lly rejected a s a fit to the

dat a before ran dom variat ion ha d been added. We again proceed with the ana lysis,

alth ough t wo polynomial para meters permit an immediate corr ection t o df= 47.

The raw Data2 for t he cub ic fit is foun d t o be 64.25; because ( ).001

2 49 is about 85.3,

th e dat a va lue from th e cubic might a llow th e fit t o be rejected, if our corr ection

were to reduce th e dfby enough. Fr om published ta bles or calculat ion, we find th at

( ). .0012 33 63 9= and ( ). .001

2 34 65 2= ; so, if we ret ain a corr ected dfof less t ha n 34, we

will reject the fit of the cub ic as not good.


36/47


As before, the correction stopping criterion will be ( )F. , .001 49 47 250= . The resultsof th e an alysis ar e in Table 5b.

Table 5b. Computation s as in Table 5 for corrected ch i square

goodn ess of f it test of the cubic ( )I a x b x= = 33

25 1 2 in Fig . 5.

Terms s igni ficant in regress ion are show n in parentheses .

Ra w

Nomial


SS

Residual Variance$F

- Data2

= 64.25 - -

0 ( )V Data K Lres , , 0 0.834828 0.017037 -

1 ( )V Data K Lres , , 1 0.780223 0.016255 1.0482 ( )V Data K Lres , , 2 0.479143 (*L 0-L 2) 0.010195 1.671

3 ( )V Data K Lres , , 3 0.476474 0.010358 1.645

4 ( )V Data K Lres , , 4 0.395849 0.008797 1.937

5 ( )V Data K Lres , , 5 0.394304 0.008961 1.901

7 ( )V Data K Lres , , 7 0.364182 0.008671 1.965

8 ( )V Data K Lres , , 8 0.360949 0.008804 1.935

12 ( )V Data K Lres , , 12 0.360192 0.009735 1.750

20 ( )V Data K Lres , , 20 0.352694 0.012162 1.400

Alth ough Ta ble 5b omit s some result s, th e value of $F wandered up to the

ma ximu m sh own a t a nomial of 7, th en declined gra dua lly to the n omial 20 value

shown; this was the limit of th e aut hor's PC softwar e. There is no sign th e stopping

crit erion would be invoked, an d only one n omia l ter m wa s dr opped becau se of

significan ce, so the corr ected chi squa re dfwould be NC =1 1 2 46 , the fina lredu ction by two being becau se the fit was a t wo-par am eter polynomia l. We repea t

that the immediate correction, because of the two polynomial terms, by subtraction

of 2 from 49, would leave a proper corrected dfof 47; in effect, we pr eten ded t her e

was at least one nonpolynomial para meter t o test t he procedure. But, 46 34>> still

mean s th at th e cubic fit to th e Figure 5 da ta would be accepted as good.


37/47


Quar tic Fit

Let's look once again a t a cur ve fit to th e data in Figure 5: This time, it is the

qua rt ic, which wa s fit by eye.

The raw Data2 for th e quar tic fit is foun d to be 83.01. Fr om published ta bles or

calculat ion, we find t ha t ( ). .0012 47 82 7= and ( ). .001

2 48 84 04= ; so, if we ret ain a

corrected dfof less tha n 48, we will reject t he fit of th e quar tic as n ot good. The

common practice in (24) above of setting dfat N NC P 1 , or our str ictly app lied

correction procedure, would assign df= 47, cau sing an imm ediat e rejection of th e fit.

We pretend th ere is at least one nonpolynomial par am eter a nd pur sue our corr ection

here t o see how it perform s in t his int eresting ma rginal case.

As before, the correction stopping criterion will be ( )F. , .001 49 47 250= . The resultsof th e corr ection a na lysis is in Table 5c.


38/47


Table 5c. Computation s for corrected chi square goodne ss of f i t test

of the quartic ( ) ( )I a x b x= = 44

70 1 2 in Fig. 5 . Terms s ignif icant in

regress ion are show n in parentheses .

Ra w

Nomial


SS

Residual Variance$F

- Data2

= 83.01 - -

0 ( )V Data K Lres , , 0 0.824809 (*L 0) 0.016833 -

1 ( )V Data K Lres , , 1 0.766592 0.015971 1.054

2

( )V Data K L

res

, ,2

0.658320 0.014007 1.202

3 ( )V Data K Lres , , 3 0.650998 0.014152 1.189

4 ( )V Data K Lres , , 4 0.407871 (*L 0-L 4) 0.009064 1.857

5 ( )V Data K Lres , , 5 0.406808 0.009246 1.821

6 ( )V Data K Lres , , 6 0.383786 0.008925 1.886

7 ( )V Data K Lres , , 7 0.366298 0.008721 1.930

8 ( )V Data K Lres , , 8 0.361569 0.008819 1.909

12 ( )V Data K Lres , , 12 0.360525 0.009744 1.728

20 ( )V Data K Lres , , 20 0.352887 0.012169 1.383

As sh own in Table 5c, th e value of $F wandered up to a ma ximum again at a

nomial of 7, but t he st opping crit erion was not invoked. However, th e low-degree

an alysis was enough to reveal two terms of significan t residuals; the pr etend-

corr ected chi squa re dfwould be NC =1 2 2 45 , and t he quar tic fit to the Figure

5 dat a is r ejected a s not good. Actu ally, we of cour se would h ave rejected t he fit at49 - 2 = 47 df, becau se th ere were just t wo polynomial para meters an d no oth ers.

Fina lly, before leaving th is exam ple, we should rema rk t ha t t he sam e quadr at ic

function wh ich failed misera bly in F igure 4 ea sily was s hown a good fit in Figur e 5.

The dispersion in the Figur e 5 dat a clearly weakened the test. In fact, even th e

cubic of Figur e 1 was a ccepted a s a good fit. When t he ra ndomn ess is grea t enough,

alm ost a ny fit will be good en ough --but , not m uch confidence in t he h ypoth esis will

be added by not rejectin g it, eith er.


39/47


We consider next a more realistic simulated example.

Applicat ion in a Twenty Cat egory Decay Simu lat ion

Assume t here is a t heory which predicts t ha t a certa in ma teria l, initially

tr an slucent, will fluoresce if illumina ted with light at a certa in, short wa velength.Let's consider a h ypoth etical experim ent in which some of th is mat erial is enclosed

in a spherical detector, a glowball, and irradiated with a pulse of short-wavelength

light . We wish to predict the tim e cour se of th e glow inside th e sphere.

The hypoth esis is tha t a 4 Pla nckian detector will respond a s th e convolut ion of

th e Gaussian light exciting impulse with a decaying negative exponent ial of the

form,

( )( ) ( )

E t w a I I d e e

tw t a; , | 0 0

0

1

2

2

=

, (33)

which h as t wo free par am eters, pulse width w an d time-const an t a, the initial

intensity I0 being fixed by theory.

The resu lt of th e simulat ed experiment is given in Figure 6.


40/47


time t

Energy

E

0

5

10

15

20

25

30

0 5 10 15 20

Mean sd of E

Mean sd of Emean

Figure 6. Glowb all decay expe riment. Time t and energy E are in

arbitrary un its . The sol id line represe nts a fi t of formula (33) above,

wi th I0 = 8 f ixed by the ory and w = 2.9 an d a = 17 f i t by eye to the 20 bin

means s hown.

Fr om published ta bles or calculat ion, we find th at ( ). .0012 19 438= and

( )F. , .001 19 17 4 81= . Computing Data2 , we get over 75. Ther efore, th e null hypoth esis

of a good fit is r ejected at once.

The analysis, although unnecessary for this particular decision, is given in

Table 6. Looking at t he several low-degree valu es ofL in t he t able, it is obvious

th ey are chan ging slowly, an d, clear ly, after compu tin g L2 , it would be very

reasonable to conclude t ha t a test at df= NC 1 1 was just ified.


41/47


Table 6. Glow ball decay expe riment. Computation s for correcte d

chi square goodne ss of fi t o f the c urve in Fig . 6, which w as text equat ion

(33) f i t by eye . Two free paramete rs , w an d a , were avai lable to f i t .

Terms in parenthes i s w ere s igni f icant in regress ion.

Ra w

NomialDescription

Residual

SS

Residual

Variance$F

- Data2

= 75.7 - -

0 ( )V Data K Lres , , 0 9.551 (*L 0) 0.503 -

1 ( )V Data K Lres , , 1 9.551 0.531 0.947

2 ( )V Data K Lres , , 2 7.631 0.449 1.120

3 ( )V Data K Lres , , 3 6.493 0.406 1.239

4 ( )V Data K Lres , , 4 6.226 0.415 1.212

8 ( )V Data K Lres , , 8 5.432 0.494 1.018

12 ( )V Data K Lres , , 12 5.559 0.794 0.634

None of th e compu ted va lues of $F exceeded even 2, alt hough t he L0 term would

not be coun ted; so, if the initia l value of 75.7 for chi squa re ha d n ot been so obvious,we would have performed the corrected test against ( ) ( ) . .001

2

001

21 1 18NC = .


42/47


But wha t if th eory allowed us t o adjust t he th ird para meter , I0 , for a bett er fit?

The result of a t hr ee-par am eter fit is shown in Figure 7 an d is ana lyzed in Ta ble 7:

t ime t

Energy

E

0

5

10

15

20

25

30

0 5 10 15 20

Mean sd ofE

Mean sd ofEmean

Figure 7. Glow ball decay expe rimen t. Analysis w ith three freeparameters . Time t and energy E are in arbitrary units . The sol id l ine

represe nts a f i t of formula (33) above, with I0 = 8.2, w = 2.95, an d a = 1 7

as f i t by eye a gain to the sa me 20 bin mea ns as in Fig. 6 .

As sh own below in Table 7, Data2 , with an additiona l free par am eter, now is just

34.9, which is below ( ). .0012 19 438= . Ther efore, assum ing a corr ection for free

par am eter s, no immedia te decision can be ma de. Becau se th e lowest dfallowing

accepta nce of the fit is 14 ( ( ). .0012 14 361= ), we seek a decision a s to wheth er t he df

might be 14 or more.

The corr ection an alysis is done in Table 7. In t ha t Table, th e wandering of th e

residual va rian ce an d t he lack of consistent increase of $F make it quite certain t hat

th e critical Fvalue of 4.81 never will be reached. Ther efore, th e fit of th e theory

using th ree free par am eters m ay be accepted as good.

Compar e the a ppeara nce of th e two fits in Figures 6 and 7: The eye can ha rdly

decide, whereas th e sta tistics a re u nequivocal, given t he category set an d th e

significance level.


43/47


Table 7. Glow ball decay data as in Table 6. Computations for

corrected ch i square goodness of f i t of the curve in Fig. 7 . Three

parameters ,I0, w , and a , were free to f i t .

Ra w

Nomial


SS

Residual

Variance$F

- Data2

= 34.97 - -

0 ( )V Data K Lres , , 0 9.179 0.4831 -

1 ( )V Data K Lres , , 1 9.173 0.5096 0.948

2 ( )V Data K Lres , , 2 8.868 0.5216 0.926

3 ( )V Data K Lres , , 3 6.622 0.4139 1.167

degrees of freedom: a correction to chi-square for physical hypotheses

Documents