dawngriffiths - gbv.de · geometrie, binomial and poisson distributions. popcorn machine xvi...


Post on 30-Aug-2019


Head First Statistics

Wouldn't it be dreamy if there was a statistics book that was more fun than an overdue trip to the dentist? But it's probably just a fantasy...

Dawn Griffiths

O'REILLY® Beijing • Cambridge • Köln • Sebastopol • Taipei • Tokyo


table of contents

Table of Contents (Summary)

Intro

1 Visualizing Information: First Impressions

2 Measuring Central Tendency: The Middle Way

3 Measuring Spread: Power Ranges

4 Calculating Probabilities: Taking Chances

5 Discrete Probability Distributions: Manage Your Expectations

6 Permutations and Combinations: Making Arrangements

7 Geometric, Binomial, and Poisson Distributions: Keeping Things Discrete

8 Normal Distribution: Being Normal

9 Normal Distribution Part II: Beyond Normal

10 Using Statistical Sampling: Taking Samples

11 Estimating Your Population: Making Predictions

12 Constructing Confidence Intervals: Guessing with Confidence

13 Using Hypothesis Tests: Look at the Evidence

14 The Chi Square Distribution: There's Something Going On

15 Correlation and Regression: What's My Line?

i Appendix i: Top Ten Things We Didn't Cover

ii Appendix ii: Statistics Tables

Table of Contents (the real thing)

Intro

Your brain on statistics. Here you are trying to learn something, while here your brain is doing you a favor by making sure the learning doesn't stick. Your brain's thinking, "Better leave room for more important things, like which wild animals to avoid and whether naked snowboarding is a bad idea." So how do you trick your brain into thinking that your life depends on knowing statistics?

Who is this book for?

We know what you're thinking

Metacognition

Bend your brain into submission

Read me

The technical review team

Acknowledgments


1 visualizing information

First Impressions

Can't tell your facts from your figures?

Statistics help you make sense of confusing sets of data. They make the complex simple. And when you've found out what's really going on, you need a way of visualizing it and telling everyone else. So if you want to pick the best chart for the job, grab your coat, pack your best slide rule, and join us on a ride to Statsville.

Statistics are everywhere

But why learn statistics?

A tale of two charts


Dealing with grouped data

Make a histogram

Step 1: Find the bar widths

Step 2: Find the bar heights

Step 3: Draw your chart

Introducing cumulative frequency

Drawing the cumulative frequency graph

Choosing the right chart

The humble pie chart

Bar charts can allow for more accuracy

Vertical bar charts

Horizontal bar charts

It's a matter of scale

Using frequency scales

Dealing with multiple sets of data

Categories vs. numbers

[Figure: two charts titled "Company Profit per Month", plotting profit by month (Jul to Dec) on different vertical scales. Speech bubbles: "See what I mean, the profit's about the same each month." and "No, this profit's amazing. Look at it soar!"]

2 measuring central tendency

The Middle Way

Sometimes you just need to get to the heart of the matter.

It can be difficult to see patterns and trends in a big pile of figures, and finding the average is often the first step towards seeing the bigger picture. With averages at your disposal, you'll be able to quickly find the most representative values in your data and draw important conclusions. In this chapter, we'll look at several ways to calculate one of the most important statistics in town (mean, median, and mode), and you'll start to see how to effectively summarize data as concisely and usefully as possible.

Welcome to the Health Club

A common measure of average is the mean

Mean math

Dealing with unknowns

Back to the mean

Back to the Health Club

Everybody was Kung Fu fighting

Our data has outliers

The outliers did it

Watercooler conversation

Finding the median

How to find the median in three steps:

Business is booming

The Little Ducklings swimming class

What went wrong with the mean and median?

What should we do for data like this?

The Mean Exposed

Introducing the mode

Three steps for finding the mode


3 measuring variability and spread

Power Ranges

Not everything's reliable, but how can you tell?

Averages do a great job of giving you a typical value in your data set, but they don't tell you the full story. OK, so you know where the center of your data is, but often the mean, median, and mode alone aren't enough information to go on when you're summarizing a data set. In this chapter, we'll show you how to take your data skills to the next level as we begin to analyze ranges and variation.

[Illustration caption: "All three players have the same average score for shooting, but I need some way of choosing between them. Think you can help?"]

Wanted: one player

We need to compare player scores

Use the range to differentiate between data sets

The problem with outliers

We need to get away from outliers

Quartiles come to the rescue

The interquartile range excludes outliers

Quartile anatomy

We're not just limited to quartiles

So what are percentiles?

Box and whisker plots let you visualize ranges

Variability is more than just spread

Calculating average distances

We can calculate variation with the variance...

...but standard deviation is a more intuitive measure

Standard Deviation Exposed

A quicker calculation for variance

What if we need a baseline for comparison?

Use standard scores to compare values across data sets

Interpreting standard scores

Statsville All Stars win the league!


4 calculating probabilities

Taking Chances

Life is full of uncertainty.

Sometimes it can be impossible to say what will happen from one minute to the next. But certain events are more likely to occur than others, and that's where probability theory comes into play. Probability lets you predict the future by assessing how likely outcomes are, and knowing what could happen helps you make informed decisions. In this chapter, you'll find out more about probability and learn how to take control of the future!


Fat Dan's Grand Slam

Roll up for roulette!

What are the chances?

Find roulette probabilities

You can visualize probabilities with a Venn diagram

You can also add probabilities

Exclusive events and intersecting events

Problems at the intersection

Some more notation

Another unlucky spin...

Conditions apply

Find conditional probabilities

Trees also help you calculate conditional probabilities

Handy hints for working with trees

Step 1: Finding P(Black ∩ Even)

Step 2: Finding P(Even)

Step 3: Finding P(Black | Even)

Use the Law of Total Probability to find P(B)

Introducing Bayes' Theorem

If events affect each other, they are dependent

If events do not affect each other, they are independent

More on calculating probability for independent events


5 using discrete probability distributions

Manage Your Expectations

Unlikely events happen, but what are the consequences?

So far we've looked at how probabilities tell you how likely certain events are. What probability doesn't tell you is the overall impact of these events, and what it means to you. Sure, you'll sometimes make it big on the roulette table, but is it really worth it with all the money you lose in the meantime? In this chapter, we'll show you how you can use probability to predict long-term outcomes, and also measure the certainty of these predictions.

Back at Fat Dan's Casino

We can compose a probability distribution for the slot machine

Expectation gives you a prediction of the results...

...and variance tells you about the spread of the results

Variances and probability distributions

Let's calculate the slot machine's variance

Fat Dan changed his prices

There's a linear relationship between E(X) and E(Y)

Slot machine transformations

General formulas for linear transforms

Every pull of the lever is an independent observation

Observation shortcuts

New slot machine on the block

Add E(X) and E(Y) to get E(X + Y)...

...and subtract E(X) and E(Y) to get E(X - Y)

You can also add and subtract linear transformations

Jackpot!

6 permutations and combinations

Making Arrangements

Sometimes, order is important.

Counting all the possible ways in which you can order things is time consuming, but the trouble is, this sort of information is crucial for calculating some probabilities. In this chapter, we'll show you a quick way of deriving this sort of information without you having to figure out what all of the possible outcomes are. Come with us and we'll show you how to count the possibilities.

The Statsville Derby

It's a three-horse race

How many ways can they cross the finish line?

Calculate the number of arrangements

Going round in circles

It's time for the novelty race

Arranging by individuals is different than arranging by type

We need to arrange animals by type

Generalize a formula for arranging duplicates

It's time for the twenty-horse race

How many ways can we fill the top three positions?

Examining permutations

What if horse order doesn't matter

Examining combinations

Combination Exposed

Does order really matter?

It's the end of the race


7 geometric, binomial, and Poisson distributions

Keeping Things Discrete

Calculating probability distributions takes time.

So far we've looked at how to calculate and use probability distributions, but wouldn't it be nice to have something easier to work with, or just quicker to calculate? In this chapter, we'll show you some special probability distributions that follow very definite patterns. Once you know these patterns, you'll be able to use them to calculate probabilities, expectations, and variances in record time. Read on, and we'll introduce you to the geometric, binomial and Poisson distributions.

[Illustration labels: Popcorn machine, Drinks machine]

We need to find Chad's probability distribution

There's a pattern to this probability distribution

The probability distribution can be represented algebraically

The geometric distribution also works with inequalities

The pattern of expectations for the geometric distribution

Expectation is 1/p

Finding the variance for our distribution

A quick guide to the geometric distribution

Who Wants to Win a Swivel Chair!

You've mastered the geometric distribution

Should you play, or walk away?

Generalizing the probability for three questions

Let's generalize the probability further

What's the expectation and variance?

Binomial expectation and variance

Your quick guide to the binomial distribution

Expectation and variance for the Poisson distribution

So what's the probability distribution?

Combine Poisson variables

The Poisson in disguise

Your quick guide to the Poisson distribution


8 using the normal distribution

Being Normal

Discrete probability distributions can't handle every situation.

So far we've looked at probability distributions where we've been able to specify exact values, but this isn't the case for every set of data. Some types of data just don't fit the probability distributions we've encountered so far. In this chapter, we'll take a look at how continuous probability distributions work, and introduce you to one of the most important probability distributions in town: the normal distribution.

Discrete data takes exact values...

...but not all numeric data is discrete

What's the delay?

We need a probability distribution for continuous data

Probability density functions can be used for continuous data

Probability = area

To calculate probability, start by finding f(x)...

...then find probability by finding the area

We've found the probability

Searching for a soul mate

Male modelling

The normal distribution is an "ideal" model for continuous data

So how do we find normal probabilities?

Three steps to calculating normal probabilities

Step 1: Determine your distribution

Step 2: Standardize to N(0, 1)

To standardize, first move the mean...

...then squash the width

Now find Z for the specific value you want to find probability for

Step 3: Look up the probability in your handy table


9 using the normal distribution II

Beyond Normal

If only all probability distributions were normal.

Life can be so much simpler with the normal distribution. Why spend all your time working out individual probabilities when you can look up entire ranges in one swoop, and still leave time for game play? In this chapter, you'll see how to solve more complex problems in the blink of an eye, and you'll also find out how to bring some of that normal goodness to other probability distributions.

All aboard the Love Train

Normal bride + normal groom

It's still just weight

How's the combined weight distributed?

Finding probabilities

More people want the Love Train

Linear transforms describe underlying changes in values...

...and independent observations describe how many values you have

Expectation and variance for independent observations

Should we play, or walk away?

Normal distribution to the rescue

When to approximate the binomial distribution with the normal

Revisiting the normal approximation

The binomial is discrete, but the normal is continuous

Apply a continuity correction before calculating the approximation

The Normal Distribution Exposed

All aboard the Love Train

When to approximate the binomial distribution with the normal

A runaway success!

[Figure: sums of independent observations: X, X + X, X + X + X, X + X + X + X]

10 using statistical sampling

Taking Samples

Statistics deal with data, but where does it come from?

Some of the time, data's easy to collect, such as the ages of people attending a health club or the sales figures for a games company. But what about the times when data isn't so easy to collect? Sometimes the number of things we want to collect data about is so huge that it's difficult to know where to start. In this chapter, we'll take a look at how you can effectively gather data in the real world, in a way that's efficient, accurate, and can also save you time and money to boot. Welcome to the world of sampling.

The Mighty Gumball taste test

They're running out of gumballs

Test a gumball sample, not the whole gumball population

How sampling works

When sampling goes wrong

How to design a sample

Define your sampling frame

Sometimes samples can be biased

Sources of bias

How to choose your sample

Simple random sampling

How to choose a simple random sample

There are other types of sampling

We can use stratified sampling...

...or we can use cluster sampling...

...or even systematic sampling

Mighty Gumball has a sample


11 estimating your population

Making Predictions

Wouldn't it be great if you could tell what a population was like, just by taking one sample?

Before you can claim full sample mastery, you need to know how to use your samples to best effect once you've collected them. This means using them to accurately predict what the population will be like and coming up with a way of saying how reliable your predictions are. In this chapter, we'll show you how knowing your sample helps you get to know your population, and vice versa.

So how long does flavor really last for?

Let's start by estimating the population mean

Point estimators can approximate population parameters

Let's estimate the population variance

We need a different point estimator than sample variance

Which formula's which?

It's a question of proportion

So how does this relate to sampling?

The sampling distribution of proportions

So what's the expectation of Ps?

And what's the variance of Ps?

Find the distribution of Ps

Ps follows a normal distribution

We need probabilities for the sample mean

The sampling distribution of the mean

Find the expectation for X̄

What about the variance of X̄?

So how is X̄ distributed?

If n is large, X̄ can still be approximated by the normal distribution

Using the central limit theorem

12 constructing confidence intervals

Guessing with Confidence

Sometimes samples don't give quite the right result.

You've seen how you can use point estimators to estimate the precise value of the population mean, variance, or proportion, but the trouble is, how can you be certain that your estimate is completely accurate? After all, your assumptions about the population rely on just one sample, and what if your sample's off? In this chapter, you'll see another way of estimating population statistics, one that allows for uncertainty. Pick up your probability tables, and we'll show you the ins and outs of confidence intervals.

Mighty Gumball is in trouble

The problem with precision

Introducing confidence intervals

Four steps for finding confidence intervals

Step 1: Choose your population statistic

Step 2: Find its sampling distribution

Step 3: Decide on the level of confidence

Step 4: Find the confidence limits

Start by finding Z

Rewrite the inequality in terms of μ

Finally, find the value of X̄

You've found the confidence interval

Let's summarize the steps

Handy shortcuts for confidence intervals

Step 1: Choose your population statistic

Step 2: Find its sampling distribution

Step 3: Decide on the level of confidence

Step 4: Find the confidence limits

The t-distribution vs. the normal distribution


13 using hypothesis tests

Look at the Evidence

Not everything you're told is absolutely certain.

The trouble is, how do you know when what you're being told isn't right? Hypothesis tests give you a way of using samples to test whether or not statistical claims are likely to be true. They give you a way of weighing the evidence and testing whether extreme results can be explained by mere coincidence, or whether there are darker forces at work. Come with us on a ride through this chapter, and we'll show you how you can use hypothesis tests to confirm or allay your deepest suspicions.

Statsville's new miracle drug

Resolving the conflict from 50,000 feet

The six steps for hypothesis testing

Step 1: Decide on the hypothesis

Step 2: Choose your test statistic

Step 3: Determine the critical region

Step 4: Find the p-value

Step 5: Is the sample result in the critical region?

Step 6: Make your decision

What if the sample size is larger?

Let's conduct another hypothesis test

Step 1: Decide on the hypotheses

Step 2: Choose the test statistic

Use the normal to approximate the binomial in our test statistic

Step 3: Find the critical region

Let's start with Type I errors

What about Type II errors?

Finding errors for SnoreCull

We need to find the range of values

Find P(Type II error)

Introducing power


14 the χ² distribution

There's Something Going On...

Sometimes things don't turn out quite the way you expect.

When you model a situation using a particular probability distribution, you have a good idea of how things are likely to turn out long-term. But what happens if there are differences between what you expect and what you get? How can you tell whether your discrepancies come down to normal fluctuations, or whether they're a sign of an underlying problem with your probability model instead? In this chapter, we'll show you how you can use the χ² distribution to analyze your results and sniff out suspicious results.

There may be trouble ahead at Fat Dan's Casino

Let's start with the slot machines

The χ² test assesses difference

So what does the test statistic represent?

Two main uses of the χ² distribution

ν represents degrees of freedom

What's the significance?

Hypothesis testing with χ²

You've solved the slot machine mystery

Fat Dan has another problem

The χ² distribution can test for independence

You can find the expected frequencies using probability

So what are the frequencies?

We still need to calculate degrees of freedom

Generalizing the degrees of freedom

And the formula is...

You've saved the casino


15 correlation and regression

What's My Line?

Have you ever wondered how two things are connected?

So far we've looked at statistics that tell you about just one variable, like men's height, points scored by basketball players, or how long gumball flavor lasts, but there are other statistics that tell you about the connection between variables. Seeing how things are connected can give you a lot of information about the real world, information that you can use to your advantage. Stay with us while we show you the key to spotting connections: correlation and regression.

Let's analyze sunshine and attendance

Exploring types of data

Visualizing bivariate data

Scatter diagrams show you patterns

Correlation vs. causation

Predict values with a line of best fit

Your best guess is still a guess

We need to minimize the errors

Introducing the sum of squared errors

Find the equation for the line of best fit

Finding the slope for the line of best fit

Finding the slope for the line of best fit, continued

We've found b, but what about a?

You've made the connection

Let's look at some correlations

The correlation coefficient measures how well the line fits the data

There's a formula for calculating the correlation coefficient, r

Find r for the concert data

Find r for the concert data, continued

[Illustration caption: "Feel that funky rhythm, baby."]

i leftovers

The Top Ten Things (we didn't cover)

Even after all that, there's a bit more.

There are just a few more things we think you need to know. We wouldn't feel right about ignoring them, even though they only need a brief mention. So before you put the book down, take a read through these short but important statistics tidbits.

#1. Other ways of presenting data

#2. Distribution anatomy

#3. Experiments

#4. Least square regression alternate notation

#5. The coefficient of determination

#6. Non-linear relationships

#7. The confidence interval for the slope of a regression line

#8. Sampling distributions - the difference between two means

#9. Sampling distributions - the difference between two proportions

#10. E(X) and Var(X) for continuous probability distributions

ii statistics tables

Looking Things Up

Where would you be without your trusty probability tables?

Understanding your probability distributions isn't quite enough. For some of them, you need to be able to look up your probabilities in standard probability tables. In this appendix you'll find tables for the normal, t and χ² distributions so you can look up probabilities to your heart's content.

Standard normal probabilities

t-distribution critical values

χ² critical values
