
Copyright © 1998, Triola, Elementary Statistics Addison Wesley Longman


Chapter 2

Descriptive Statistics

2-1 Overview

2-2 Summarizing Data

2-3 Pictures of Data

2-4 Measures of Central Tendency

2-5 Measures of Variation

2-6 Measures of Position

2-7 Exploratory Data Analysis

Review and Projects


2-1 Overview

Descriptive Statistics: summarizes or describes the important characteristics of a known set of population data

Inferential Statistics: uses sample data to make inferences about a population


Important Characteristics of Data

1. Nature or shape of the distribution, such as bell-shaped, uniform, or skewed

2. Representative score, such as an average

3. Measure of scattering or variation


2-2 Summarizing Data With Frequency Tables

Frequency Table: lists categories (or classes) of scores, along with counts (or frequencies) of the number of scores that fall into each category


Table 2-1  Axial Loads of 0.0109 in. Cans

270 278 250 278 290 274 242 269 257 272 265 263 234 270 273 270 277 294 279 268 230 268 278 268 262
273 201 275 260 286 272 284 282 278 268 263 273 282 285 289 268 208 292 275 279 276 242 285 273 268
258 264 281 262 278 265 241 267 295 283 281 209 276 273 263 218 271 289 223 217 225 283 292 270 262
204 265 271 273 283 275 276 282 270 256 268 259 272 269 270 251 208 290 220 259 282 277 282 256 293
254 223 263 274 262 263 200 272 268 206 280 287 257 284 279 252 280 215 281 291 276 285 287 297 290
228 274 277 286 277 251 278 277 286 277 289 269 267 276 206 284 269 284 268 291 289 293 277 280 274
282 230 275 236 295 289 283 261 262 252 283 277 204 286 270 278 270 283 272 281 288 248 266 256 292


Table 2-2  Frequency Table of Axial Loads of Aluminum Cans

Axial Load    Frequency
200 - 209         9
210 - 219         3
220 - 229         5
230 - 239         4
240 - 249         4
250 - 259        14
260 - 269        32
270 - 279        52
280 - 289        38
290 - 299        14


Frequency Table Definitions

• Class: An interval.

• Lower Class Limit: The left endpoint of a class.

• Upper Class Limit: The right endpoint of a class.

• Class Mark: The midpoint of the class.

• Class Width: The difference between two consecutive lower class limits.


Definition values for the example (Table 2-2)

Lower class limits: 200, 210, …

Upper class limits: 209, 219, …

Class marks: 204.5 = (200 + 209)/2, 214.5, …

Class width: 210 – 200 = 10
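The definition values for Table 2-2 can be checked with a few lines of Python. This is only an illustrative sketch; the variable names are assumptions made here and do not come from the slides.

```python
# Classes of Table 2-2 as (lower class limit, upper class limit) pairs.
classes = [(lower, lower + 9) for lower in range(200, 300, 10)]

lower_limits = [lo for lo, _ in classes]
upper_limits = [hi for _, hi in classes]

# Class mark: the midpoint of each class.
class_marks = [(lo + hi) / 2 for lo, hi in classes]

# Class width: difference between two consecutive lower class limits.
class_width = lower_limits[1] - lower_limits[0]

print(lower_limits[:2], upper_limits[:2], class_marks[:2], class_width)
```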


Determine the Definition Values for this Frequency Table

Quiz Scores    Frequency
0 - 4              2
5 - 9              5
10 - 14            8
15 - 19           11
20 - 24            7

Classes

Lower Class Limits

Upper Class Limits

Class Marks

Class Width


Constructing A Frequency Table

1. Decide on the number of classes.

2. Determine the class width by dividing the range by the number of classes (range = highest score – lowest score) and round up.

   class width = round up of (range / number of classes)

3. Select for the first lower limit either the lowest score or a convenient value slightly less than the lowest score.

4. Add the class width to the starting point to get the second lower class limit.

5. List the lower class limits in a vertical column and enter the upper class limits.

6. Represent each score by a tally mark in the appropriate class. Total tally marks to find the total frequency for each class.
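The six construction steps can be sketched in Python as follows. The function name `frequency_table` and its parameters are hypothetical, the sketch assumes integer scores, and the sample input is just the first 25 axial loads of Table 2-1 with an arbitrarily chosen 7 classes.

```python
import math
from collections import Counter


def frequency_table(scores, num_classes, first_lower_limit=None):
    """Return rows of ((lower limit, upper limit), frequency).

    Assumes integer scores, so each upper class limit is one less
    than the next lower class limit, as in Table 2-2.
    """
    # Step 2: class width = range / number of classes, rounded up.
    width = math.ceil((max(scores) - min(scores)) / num_classes)
    # Step 3: start at the lowest score unless a convenient value is given.
    start = min(scores) if first_lower_limit is None else first_lower_limit
    # Steps 4-5: lower class limits, then the matching upper class limits.
    limits = [(start + i * width, start + (i + 1) * width - 1)
              for i in range(num_classes)]
    if min(scores) < limits[0][0] or max(scores) > limits[-1][1]:
        raise ValueError("classes do not cover all scores; adjust the inputs")
    # Step 6: tally each score into its class.
    tally = Counter((score - start) // width for score in scores)
    return [(limit, tally[i]) for i, limit in enumerate(limits)]


# First 25 axial loads of Table 2-1.
first_25 = [270, 278, 250, 278, 290, 274, 242, 269, 257, 272, 265, 263, 234,
            270, 273, 270, 277, 294, 279, 268, 230, 268, 278, 268, 262]

table = frequency_table(first_25, num_classes=7)
for (lower, upper), frequency in table:
    print(f"{lower} - {upper}: {frequency}")
```

The class 280 - 289 comes out with frequency zero and is still listed, and the frequencies add up to the 25 input values.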


Guidelines For Frequency Tables

1. Classes should be mutually exclusive.

2. Include all classes, even if the frequency is zero.

3. Try to use the same width for all classes.

4. Select convenient numbers for class limits.

5. Use between 5 and 20 classes.

6. The sum of the class frequencies must equal the number of original data values.


Relative Frequency Table

$\text{relative frequency} = \frac{\text{class frequency}}{\text{sum of all frequencies}}$


Relative Frequency Table

Axial Load    Frequency      Relative Frequency
              (Table 2-2)    (Table 2-3)
200 - 209         9          9/175 = 0.051
210 - 219         3          3/175 = 0.017
220 - 229         5          5/175 = 0.029
230 - 239         4          0.023
240 - 249         4          0.023
250 - 259        14          0.080
260 - 269        32          0.183
270 - 279        52          0.297
280 - 289        38          0.217
290 - 299        14          0.080
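As a quick check of Table 2-3, the relative frequencies can be recomputed from the Table 2-2 counts. This is a minimal sketch; rounding to three decimal places is an assumption chosen to match the table.

```python
# Class frequencies from Table 2-2.
frequencies = [9, 3, 5, 4, 4, 14, 32, 52, 38, 14]

total = sum(frequencies)  # number of original data values

# relative frequency = class frequency / sum of all frequencies
relative = [round(f / total, 3) for f in frequencies]

print(total, relative)
```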


Cumulative Frequency Table

Table 2-2                    Table 2-4
Score        Frequency       Axial Load        Cumulative Frequency
200 - 209        9           Less than 210           9
210 - 219        3           Less than 220          12
220 - 229        5           Less than 230          17
230 - 239        4           Less than 240          21
240 - 249        4           Less than 250          25
250 - 259       14           Less than 260          39
260 - 269       32           Less than 270          71
270 - 279       52           Less than 280         123
280 - 289       38           Less than 290         161
290 - 299       14           Less than 300         175
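The cumulative frequencies of Table 2-4 are running totals of the Table 2-2 counts. A small sketch, with the label format assumed from the table:

```python
from itertools import accumulate

# Lower class limits and frequencies from Table 2-2.
lower_limits = list(range(200, 300, 10))
frequencies = [9, 3, 5, 4, 4, 14, 32, 52, 38, 14]
class_width = 10

# Each cumulative class is "less than" the next lower class limit.
labels = [f"Less than {lower + class_width}" for lower in lower_limits]
cumulative = list(accumulate(frequencies))

for label, count in zip(labels, cumulative):
    print(f"{label}: {count}")
```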


Frequency Tables

Axial Load    Frequency      Relative Frequency    Axial Load        Cumulative Frequency
              (Table 2-2)    (Table 2-3)                             (Table 2-4)
200 - 209         9              0.051             Less than 210           9
210 - 219         3              0.017             Less than 220          12
220 - 229         5              0.029             Less than 230          17
230 - 239         4              0.023             Less than 240          21
240 - 249         4              0.023             Less than 250          25
250 - 259        14              0.080             Less than 260          39
260 - 269        32              0.183             Less than 270          71
270 - 279        52              0.297             Less than 280         123
280 - 289        38              0.217             Less than 290         161
290 - 299        14              0.080             Less than 300         175


Mean

FIGURE 2-7  Mean as a Balance Point


Notation

$\Sigma$ denotes the summation of a set of values

$x$ is the variable usually used to represent the individual data values

$n$ represents the number of data values in a sample

$N$ represents the number of data values in a population

$\bar{x}$ is pronounced ‘x-bar’ and denotes the mean of a set of sample values

$\mu$ is pronounced ‘mu’ and denotes the mean of all values in a population


Definitions

Mean: the value obtained by adding the scores and dividing the total by the number of scores

Sample: $\bar{x} = \frac{\sum x}{n}$

Population: $\mu = \frac{\sum x}{N}$

Calculators can calculate the mean of data


Definitions

Median: the middle value when scores are arranged in (ascending or descending) order

often denoted by $\tilde{x}$ (pronounced ‘x-tilde’)

is not affected by an extreme value


• 1 1 3 3 4 5 5 5 5 5 (in order)

  no exact middle -- shared by two numbers

  MEDIAN is (4 + 5)/2 = 4.5

• 5 5 5 3 1 5 1 4 3 5 2

  1 1 2 3 3 4 5 5 5 5 5 (in order)

  exact middle

  MEDIAN is 4
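The two median cases can be written out directly. The function below is a sketch for illustration (the name `median` is chosen here); Python's standard library offers `statistics.median`, which behaves the same way.

```python
def median(scores):
    """Middle value of the ranked scores; mean of the two middle values if n is even."""
    ranked = sorted(scores)
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:
        return ranked[middle]                        # exact middle
    return (ranked[middle - 1] + ranked[middle]) / 2  # shared by two numbers


even_case = median([1, 1, 3, 3, 4, 5, 5, 5, 5, 5])
odd_case = median([5, 5, 5, 3, 1, 5, 1, 4, 3, 5, 2])
print(even_case, odd_case)
```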


Definitions

Mode: the score that occurs most frequently

A data set can be bimodal, multimodal, or have no mode

The mode is the only measure of central tendency that can be used with nominal data



Examples

a. 5 5 5 3 1 5 1 4 3 5          • Mode is 5

b. 2 2 2 3 4 5 6 6 6 7 9        • Bimodal

c. 2 3 6 7 8 9 10               • No Mode

d. 2 2 3 3 3 4                  • Mode is 3

e. 2 2 3 3 4 4 5 5              • No Mode
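The five examples can be reproduced with a short function. This sketch assumes the convention suggested by examples c and e, namely that a data set in which every score occurs equally often has no mode; the function name `modes` is hypothetical.

```python
from collections import Counter


def modes(scores):
    """Return the list of modes; an empty list means 'no mode'."""
    counts = Counter(scores)
    highest = max(counts.values())
    # Assumed convention: if all distinct scores are equally frequent, no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(value for value, c in counts.items() if c == highest)


examples = {
    "a": [5, 5, 5, 3, 1, 5, 1, 4, 3, 5],
    "b": [2, 2, 2, 3, 4, 5, 6, 6, 6, 7, 9],
    "c": [2, 3, 6, 7, 8, 9, 10],
    "d": [2, 2, 3, 3, 3, 4],
    "e": [2, 2, 3, 3, 4, 4, 5, 5],
}
results = {label: modes(data) for label, data in examples.items()}
print(results)
```

A result with two values corresponds to "Bimodal" on the slide.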


Definitions

Midrange: the value halfway between the highest and lowest scores

$\text{Midrange} = \frac{\text{highest score} + \text{lowest score}}{2}$


Round-off rule for measures of central tendency

Carry one more decimal place than is present in the original set of data


An Example of Skewness

Dataset 1: 3, 4, 4, 5, 5, 5, 6, 6, 7. Mean = 5, Median = 5. Symmetric

Dataset 2: 3, 4, 4, 5, 5, 5, 7, 7, 9. Mean = 5.444, Median = 5. Skewed right

Dataset 3: 2, 3, 3, 5, 5, 5, 6, 6, 7. Mean = 4.667, Median = 5. Skewed left

[Figure: frequency histograms of the three datasets, panels C1, C2, C3; vertical axis "Frequency", 0 to 3]
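The means and medians of the three datasets can be recomputed as below. Labeling the shape by comparing the mean with the median is only a rough indicator used here for illustration, not a formal definition of skewness; the helper name `describe` is an assumption.

```python
from statistics import mean, median

datasets = {
    1: [3, 4, 4, 5, 5, 5, 6, 6, 7],
    2: [3, 4, 4, 5, 5, 5, 7, 7, 9],
    3: [2, 3, 3, 5, 5, 5, 6, 6, 7],
}


def describe(scores):
    """Return (mean rounded to 3 places, median, rough shape label)."""
    m, med = mean(scores), median(scores)
    if m > med:
        shape = "skewed right"   # long right tail pulls the mean up
    elif m < med:
        shape = "skewed left"    # long left tail pulls the mean down
    else:
        shape = "symmetric"
    return round(m, 3), med, shape


summary = {label: describe(data) for label, data in datasets.items()}
print(summary)
```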


Skewness

Figure 2-8 (a)  SKEWED LEFT (negatively): Mean, Median, Mode (left to right)

Figure 2-8 (b)  SYMMETRIC: Mode = Mean = Median

Figure 2-8 (c)  SKEWED RIGHT (positively): Mode, Median, Mean (left to right)


Best Measure of Central Tendency

• Advantages - Disadvantages (Table 2-6)


Mean from a Frequency Table

use class mark of classes for variable x

Formula 2-2:  $\bar{x} = \frac{\sum (f \cdot x)}{\sum f}$

x = class mark

f = frequency

$\sum f = n$


Quiz Scores    Frequency    Class Marks
0 - 4              2             2
5 - 9              5             7
10 - 14            8            12
15 - 19           11            17
20 - 24            7            22

Mean of this frequency table = 14.4
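Formula 2-2 applied to the quiz-score table can be sketched as follows; the variable names are chosen here for illustration.

```python
# Quiz-score frequency table: (lower limit, upper limit) and frequency per class.
class_limits = [(0, 4), (5, 9), (10, 14), (15, 19), (20, 24)]
frequencies = [2, 5, 8, 11, 7]

# x = class mark (midpoint of each class).
class_marks = [(lower + upper) / 2 for lower, upper in class_limits]

n = sum(frequencies)                                            # sum of f
weighted_sum = sum(f * x for f, x in zip(frequencies, class_marks))  # sum of f * x
mean_from_table = weighted_sum / n

print(class_marks, n, weighted_sum, round(mean_from_table, 1))
```

Because each score is replaced by its class mark, the result approximates the mean of the original scores rather than reproducing it exactly.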


Waiting Times of Bank Customers at Different Banks (in minutes)

Jefferson Valley Bank    6.5  6.6  6.7  6.8  7.1  7.3  7.4  7.7  7.7  7.7
Bank of Providence       4.2  5.4  5.8  6.2  6.7  7.7  7.7  8.5  9.3  10.0

             Jefferson Valley Bank    Bank of Providence
Mean                7.15                    7.15
Median              7.20                    7.20
Mode                7.7                     7.7
Midrange            7.10                    7.10
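The four measures of central tendency for the two banks can be recomputed with Python's `statistics` module. This is a sketch; the helper `center_summary` is a name introduced here, and results are rounded to two decimal places for comparison with the table.

```python
from statistics import mean, median, mode

jefferson_valley = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
bank_of_providence = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]


def center_summary(times):
    """Mean, median, mode and midrange of a list of waiting times."""
    return {
        "mean": round(mean(times), 2),
        "median": round(median(times), 2),
        "mode": mode(times),
        "midrange": round((max(times) + min(times)) / 2, 2),
    }


jv_summary = center_summary(jefferson_valley)
bp_summary = center_summary(bank_of_providence)
print(jv_summary)
print(bp_summary)
```

Both banks give identical values for all four measures, even though the Bank of Providence times are visibly more spread out.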


Measure of Variation

Range = highest score – lowest score


Measure of Variation

Standard Deviation: a measure of variation of the scores about the mean (average deviation from the mean)


Sample Standard Deviation Formula

Formula 2-4:  $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

calculators can calculate sample standard deviation of data


Find the standard deviation of the sample data: 2, 3, 4, 5, 5, 5. $s^2 = 8/5 = 1.6$, $s = 1.26$.

Use the shortcut formula to find the standard deviations of the above data, and the waiting times at the two banks.

1) $\sum x^2 = 104$, $\sum x = 24$, $s = 1.26$.

2) Jefferson Valley Bank: $\sum x^2 = 513.27$, $\sum x = 71.5$, $s = 0.48$.

3) Bank of Providence: $\sum x^2 = 541.09$, $\sum x = 71.5$, $s = 1.82$.
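The first calculation follows the defining formula for the sample standard deviation step by step. A minimal sketch (the function name `sample_std` is chosen here; `statistics.stdev` in the standard library computes the same quantity):

```python
import math


def sample_std(scores):
    """Sample standard deviation from the defining formula (divide by n - 1)."""
    n = len(scores)
    x_bar = sum(scores) / n
    sum_sq_dev = sum((x - x_bar) ** 2 for x in scores)  # sum of (x - x_bar)^2
    return math.sqrt(sum_sq_dev / (n - 1))


data = [2, 3, 4, 5, 5, 5]
s = sample_std(data)
print(round(s ** 2, 1), round(s, 2))
```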


Population Standard Deviation

$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$

calculators can calculate the population standard deviation of data


Symbols for Standard Deviation

                                Sample               Population
Textbook                        $s$                  $\sigma$
Some graphics calculators       $S_x$                $\sigma_x$
Some nongraphics calculators    $x\sigma_{n-1}$      $x\sigma_n$


Measure of Variation

Variance: standard deviation squared (use square key on calculator)

Notation: $s^2$ (sample variance), $\sigma^2$ (population variance)


Variance

Sample Variance:  $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

Population Variance:  $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$


Round-off Rule

for measures of variation

Carry one more decimal place than was present in the original data


Standard Deviation Shortcut Formula

Formula 2-6:  $s = \sqrt{\frac{n(\sum x^2) - (\sum x)^2}{n(n - 1)}}$
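The shortcut formula only needs n, the sum of the scores, and the sum of their squares. The sketch below applies it to the bank waiting times and the small sample 2, 3, 4, 5, 5, 5 from the earlier slides; the function name is an assumption.

```python
import math


def sample_std_shortcut(scores):
    """Sample standard deviation via the shortcut formula (Formula 2-6)."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x2 = sum(x * x for x in scores)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))


jefferson_valley = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
bank_of_providence = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]

s_small = sample_std_shortcut([2, 3, 4, 5, 5, 5])
s_jv = sample_std_shortcut(jefferson_valley)
s_bp = sample_std_shortcut(bank_of_providence)
print(round(s_small, 2), round(s_jv, 2), round(s_bp, 2))
```

In floating-point arithmetic the subtraction in the numerator can lose precision when the scores are large relative to their spread, so the defining formula is usually the safer choice in software.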


FIGURE 2-10  Same Means ($\bar{x} = 4$), Different Standard Deviations

[Figure: four frequency histograms on a horizontal scale of 1 to 7, with s = 0, s = 0.8, s = 1.0, and s = 3.0]

Standard deviation gets larger as spread of data increases.




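The Empirical Rule (applies to bell shaped distributions)

• 68% of data are within 1 standard deviation of the mean ($\bar{x} - s$ to $\bar{x} + s$)

• 95% of data are within 2 standard deviations of the mean ($\bar{x} - 2s$ to $\bar{x} + 2s$)

• 99.7% of data are within 3 standard deviations of the mean ($\bar{x} - 3s$ to $\bar{x} + 3s$)

FIGURE 2-10  [Figure: bell-shaped curve marked at $\bar{x} - 3s$, $\bar{x} - 2s$, $\bar{x} - s$, $\bar{x}$, $\bar{x} + s$, $\bar{x} + 2s$, $\bar{x} + 3s$; areas from left to right: 0.001, 0.024, 0.135, 0.340, 0.340, 0.135, 0.024, 0.001]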


Range Rule of Thumb

$\bar{x} - 2s$ (minimum),  $\bar{x}$,  $\bar{x} + 2s$ (maximum)

$\text{Range} \approx 4s$  or  $s \approx \frac{\text{Range}}{4}$
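As a rough illustration, the range rule of thumb can be compared with the standard deviations found earlier for the two banks (s = 0.48 and s = 1.82). The function name is hypothetical, and with only ten scores per bank the estimate should be expected to be crude.

```python
def estimate_std_from_range(scores):
    """Range rule of thumb: s is roughly range / 4."""
    return (max(scores) - min(scores)) / 4


jefferson_valley = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
bank_of_providence = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]

estimate_jv = round(estimate_std_from_range(jefferson_valley), 2)    # compare with s = 0.48
estimate_bp = round(estimate_std_from_range(bank_of_providence), 2)  # compare with s = 1.82
print(estimate_jv, estimate_bp)
```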


Chebyshev’s Theorem

applies to distributions of any shape

the proportion (or fraction) of any set of data lying within k standard deviations of the mean is always at least $1 - 1/k^2$, where k is any positive number greater than 1.
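The bound can be compared with the actual proportion in a data set. The sketch below uses the Bank of Providence waiting times from the earlier slide and the population formula (`statistics.pstdev`); since the sample standard deviation s is never smaller, the bound also holds with s. The function names are assumptions.

```python
from statistics import mean, pstdev


def chebyshev_lower_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2


def proportion_within(scores, k):
    """Actual proportion of scores within k standard deviations of the mean."""
    center = mean(scores)
    spread = pstdev(scores)
    inside = [x for x in scores if abs(x - center) <= k * spread]
    return len(inside) / len(scores)


bank_of_providence = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]

for k in (1.5, 2):
    print(k, chebyshev_lower_bound(k), proportion_within(bank_of_providence, k))
```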


Measures of Variation

Summary

• For typical data sets, it is unusual for a

score to differ from the mean by more than

2 or 3 standard deviations.


An application of measure of variation

There are two brands of car tires, A and B. Both have a mean lifetime of 60,000 miles, but Brand A has a standard deviation of lifetime of 1000 miles and Brand B has a standard deviation of lifetime of 3000 miles. Which brand would you prefer?
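One way to compare the two brands is to look at the interval within 2 standard deviations of the mean, where most lifetimes of a typical data set would fall. This is a sketch under that assumption; the function name `usual_range` is hypothetical.

```python
def usual_range(mean, std_dev, k=2):
    """Interval from mean - k * std_dev to mean + k * std_dev."""
    return mean - k * std_dev, mean + k * std_dev


brand_a = usual_range(60_000, 1_000)
brand_b = usual_range(60_000, 3_000)
print("Brand A:", brand_a)
print("Brand B:", brand_b)
```

Brand A's interval is much narrower, so its lifetimes are more predictable around the same mean.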


Quartiles

Q1, Q2, Q3 divide ranked scores into four equal parts

25%  Q1  25%  Q2  25%  Q3  25%


Percentiles

• 99 Percentiles ($P_1$ through $P_{99}$) divide the ranked scores into 100 groups


Finding the Percentile of a Given Score

$\text{Percentile of score } x = \frac{\text{number of scores less than } x}{\text{total number of scores}} \cdot 100$

[1] 200 201 204 204 206 206 208 208 209 215 217 218 220 223 223

[16] 225 228 230 230 234 236 241 242 242 248 250 251 251 252 252

[31] 254 256 256 256 257 257 258 259 259 260 261 262 262 262 262

[46] 262 263 263 263 263 263 264 265 265 265 266 267 267 268 268

[61] 268 268 268 268 268 268 268 269 269 269 269 270 270 270 270

[76] 270 270 270 270 271 271 272 272 272 272 272 273 273 273 273

[91] 273 273 274 274 274 274 275 275 275 275 276 276 276 276 276

[106] 277 277 277 277 277 277 277 277 278 278 278 278 278 278 278

[121] 279 279 279 280 280 280 281 281 281 281 282 282 282 282 282

[136] 282 283 283 283 283 283 283 284 284 284 284 285 285 285 286

[151] 286 286 286 287 287 288 289 289 289 289 289 290 290 290 291

[166] 291 292 292 292 293 293 294 295 295 297

Sorted Axial Loads of 175 Aluminum Cans
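The percentile formula counts the scores below x. A minimal sketch, using the Bank of Providence waiting times from the earlier slide as a small input and the cumulative count "Less than 260" from Table 2-4 for the axial loads; the function name is an assumption, and the result is left unrounded.

```python
def percentile_of_score(scores, x):
    """Percentile of x = (number of scores less than x) / (total number of scores) * 100."""
    below = sum(1 for score in scores if score < x)
    return below / len(scores) * 100


bank_of_providence = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]
percentile_bank = percentile_of_score(bank_of_providence, 7.7)

# Table 2-4: 39 of the 175 axial loads are less than 260.
percentile_260 = 39 / 175 * 100

print(percentile_bank, round(percentile_260))
```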


Finding the Value of the kth Percentile

Start

1. Rank the data. (Arrange the data in order of lowest to highest.)

2. Compute $L = \left(\frac{k}{100}\right) n$, where n = number of scores and k = percentile in question.

3. Is L a whole number?

• Yes: The value of the kth percentile is midway between the Lth score and the next higher score in the ranked data. Find $P_k$ by adding the Lth score and the next higher score and dividing the total by 2.

• No: Change L by rounding it up to the next larger whole number. The value of $P_k$ is the Lth score, counting from the lowest.



Examples using the sorted axial loads of 175 aluminum cans:

The 10th percentile: L = 175 × 10/100 = 17.5, rounded up to 18. So the 10th percentile is the 18th score in the sorted data, i.e., 230.

The 25th percentile: L = 175 × 25/100 = 43.75, rounded up to 44. So the 25th percentile is the 44th score in the sorted data, i.e., 262.
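The percentile procedure can be sketched in Python and checked against the two examples. The function name `percentile_value` is hypothetical, the data are the 175 sorted axial loads, and the sketch assumes 0 < k < 100.

```python
import math

axial_loads = [
    200, 201, 204, 204, 206, 206, 208, 208, 209, 215, 217, 218, 220, 223, 223,
    225, 228, 230, 230, 234, 236, 241, 242, 242, 248, 250, 251, 251, 252, 252,
    254, 256, 256, 256, 257, 257, 258, 259, 259, 260, 261, 262, 262, 262, 262,
    262, 263, 263, 263, 263, 263, 264, 265, 265, 265, 266, 267, 267, 268, 268,
    268, 268, 268, 268, 268, 268, 268, 269, 269, 269, 269, 270, 270, 270, 270,
    270, 270, 270, 270, 271, 271, 272, 272, 272, 272, 272, 273, 273, 273, 273,
    273, 273, 274, 274, 274, 274, 275, 275, 275, 275, 276, 276, 276, 276, 276,
    277, 277, 277, 277, 277, 277, 277, 277, 278, 278, 278, 278, 278, 278, 278,
    279, 279, 279, 280, 280, 280, 281, 281, 281, 281, 282, 282, 282, 282, 282,
    282, 283, 283, 283, 283, 283, 283, 284, 284, 284, 284, 285, 285, 285, 286,
    286, 286, 286, 287, 287, 288, 289, 289, 289, 289, 289, 290, 290, 290, 291,
    291, 292, 292, 292, 293, 293, 294, 295, 295, 297,
]


def percentile_value(scores, k):
    """Value of the kth percentile, following the ranked-position procedure."""
    ranked = sorted(scores)                # rank the data
    position = k * len(ranked) / 100       # L = (k / 100) * n
    if position.is_integer():
        # L is a whole number: midway between the Lth and the next score.
        index = int(position)
        return (ranked[index - 1] + ranked[index]) / 2
    # Otherwise round L up and take the Lth score, counting from the lowest.
    return ranked[math.ceil(position) - 1]


p10 = percentile_value(axial_loads, 10)
p25 = percentile_value(axial_loads, 25)
print(p10, p25)
```

Statistical software often uses different interpolation rules for percentiles, so library results may differ slightly from this procedure.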


Interquartile Range: $Q_3 - Q_1$

Semi-interquartile Range: $\frac{Q_3 - Q_1}{2}$

Midquartile: $\frac{Q_1 + Q_3}{2}$



Exploratory Data Analysis

• Used to explore data at a preliminary level

• Few or no assumptions are made about the data

• Tends to involve relatively simple calculations and graphs

Traditional Statistics

• Used to confirm final conclusions about data

• Typically requires some very important assumptions about the data

• Calculations are often complex, and graphs are often unnecessary


Boxplots (Box-and-Whisker Diagram)

5-number summary

• Minimum

• first quartile Q1

• Median

• third quartile Q3

• Maximum
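A 5-number summary can be sketched by reusing the percentile procedure from the earlier slides, under the assumption that Q1, the median, and Q3 are taken as the 25th, 50th, and 75th percentiles. The data are the Bank of Providence waiting times; the function names are hypothetical.

```python
import math


def percentile_value(ranked, k):
    """kth percentile of already ranked scores (ranked-position procedure)."""
    position = k * len(ranked) / 100
    if position.is_integer():
        index = int(position)
        return (ranked[index - 1] + ranked[index]) / 2
    return ranked[math.ceil(position) - 1]


def five_number_summary(scores):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ranked = sorted(scores)
    return (ranked[0],
            percentile_value(ranked, 25),
            percentile_value(ranked, 50),
            percentile_value(ranked, 75),
            ranked[-1])


bank_of_providence = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]
minimum, q1, med, q3, maximum = five_number_summary(bank_of_providence)
interquartile_range = q3 - q1
print(minimum, q1, med, q3, maximum, round(interquartile_range, 1))
```

The box of a boxplot spans Q1 to Q3, so its length is the interquartile range, and the whiskers extend to the minimum and maximum.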


Boxplots (Box-and-Whisker Diagram)

Figure 2-13 Boxplot of Pulse Rates (Beats per minute) of Smokers

Minimum = 52, Q1 = 60, Median = 68.5, Q3 = 78, Maximum = 90


Figure 2-14 Boxplots

[Figure: boxplots for Normal, Uniform, and Skewed distributions]


Outliers

Values that are very far away from most of the data

[Figure: boxplot of the axial loads; vertical axis "Axial Load", 200 to 300]


Class Survey Data

Boxplots for the heights of those who never broke a bone and those who did

[Figure: side-by-side boxplots; horizontal axis "Bone" (n, y), vertical axis "Height", 60 to 75]


When comparing two or more boxplots, it is necessary to use the same scale.

[Figure: side-by-side boxplots of PULSE (scale 40 to 100) by SMOKE group: 1 (yes), 2 (No)]