1_introduction to statistics_jan-2, 2012 [compatibility mode]

Post on 12-Nov-2014

Statistics for Management

Ramesh Anbanandam
ramesh@nitc.ac.in

Department of Mechanical Engineering, NIT Calicut, Kerala, India - 673 601.

Dedicated to

Professor S. G. Deshmukh

ramesh.anbanandam@gmail.com

Objectives of this course…

• Appreciate the role of statistics in various decision making situations

• Summarize data with frequency distributions and graphic presentation.

• Interpret descriptive statistics for central tendency, dispersion and location

• Define and interpret probability. Utilize discrete and continuous probability distributions to determine probabilities in various managerial applications.

• Apply the central limit theorem to determine probabilities of sample means and compute and interpret point and interval estimates.

• Conduct hypothesis tests for means

• Utilize linear regression to estimate and predict variables.

• Understand basic concepts of design of experiments

• Understand the importance of non-parametric tests

Lab/tutorial

• The laboratory work requires prior experience with Excel. There will be quizzes/assignments every week. Lab assignments are to be submitted on the same day. Students will also be required to visit and consult useful web resources.

Mode of Evaluation and Grades

• Grades are based on total points earned from tests 1 and 2, lab/tutorial/assignments, mini-project and end-semester examination.

Component                                 Weight
Test 1                                    15 %
Test 2                                    15 %
End Semester                              40 %
Lab/tutorial/assignments (every week)     10 %
Mini-Project                              10 %
Surprise quizzes                          10 %

Reference

• Meyer PL, Introductory Probability and Statistical Applications, Oxford and IBH Publishers

• Miller IR, Freund JE, Johnson R, Probability and Statistics for Engineers, Prentice-Hall (I) Ltd

• Walpole RE and Myers RH, Probability & Statistics for Engineers and Scientists, Macmillan

• Levin, R. I. and Rubin, D. S., Statistics for Management, Pearson Education

• Levine, David, Stephan, David, Krehbiel, Timothy and Berenson, Mark, Statistics for Managers Using Microsoft Excel, Prentice Hall

Statistics..

• Plays an important role in many facets of human endeavour

• Occurs remarkably frequently in our everyday lives

• It is often incorrectly thought of as just a collection of data, graphs and diagrams

Statistics in Business

• Accounting: auditing and cost estimation

• Economics: regional, national, and international economic performance

• Finance: investments and portfolio management

• Management: human resources, compensation, and quality management

• Management Information Systems (ERP): performance of systems which gather, summarize, and disseminate information to various managerial levels

• Marketing: market analysis and consumer research

• International Business: market and demographic analysis

What is Statistics?

• Science of gathering, analyzing, interpreting, and presenting data

• Branch of mathematics

• Facts and figures

• Measurement taken on a sample

Statistics is the scientific method that enables us to make decisions as responsibly as possible.

Statistics…

• The science of using data to answer research questions

– Formulate a research question(s) (hypothesis)

– Collect data

– Analyze and summarize data

– Draw conclusions to answer research questions

• Statistical Inference

– In the presence of variation

Answers Questions from Everyday Life

• Business: Will a new marketing strategy be profitable?

• Industry: Will a product’s life exceed the warranty period?

• Medicine: Will this year’s flu vaccine reduce the chance of flu?

• Education: Will technology improve learning?

• Government: Will a change in interest rates affect inflation?

Statistics: Science of variability?

• Virtually everything varies

• Variation occurs among individuals

• Variation occurs within any one individual as time passes

Can Statistics Be Trusted?

“There are three kinds of lies: lies, damned lies, and statistics.” --Mark Twain (who attributed the remark to Benjamin Disraeli)

“It is easy to lie with statistics. But it is easier to lie without them.” --Frederick Mosteller

“Figures won’t lie but liars will figure.” --Charles Grosvenor

Population Versus Sample

• Population: the whole

– a collection of persons, objects, or items under study

– the entire group of individuals in a statistical study that we want information about

• Census: gathering data from the entire population

• Sample: a portion of the whole

– a subset of the population

– a part of the population from which we actually collect information, used to draw conclusions about the whole (statistical inference)

Statistics can be split into two broad categories

1. Descriptive statistics

2. Statistical inference

Descriptive Statistics

• Collect data

– ex. Survey

• Present data

– ex. Tables and graphs

• Characterize data

– ex. Sample mean: X̄ = (Σ Xᵢ) / n
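The sample mean named on this slide can be sketched in a few lines. This is only an illustration: the `weights` list is a hypothetical sample invented for the example, and `sample_mean` is a helper name that does not come from the slides.

```python
# Hypothetical sample of five weights in pounds (illustrative values only).
weights = [118, 122, 125, 119, 116]


def sample_mean(values):
    """Return the sample mean: the sum of the observations divided by n."""
    n = len(values)
    return sum(values) / n


x_bar = sample_mean(weights)
print(x_bar)  # 120.0
```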

Descriptive statistics..

• Encompasses the following:

– Graphical or pictorial display

– Condensation of large masses of data into a form such as tables

– Preparation of summary measures to give a concise description of complex information (e.g. an average figure)

– Exhibition of patterns that may be found in sets of information

Inferential Statistics

Drawing conclusions and/or making decisions concerning a population based on sample results.

• Estimation

– ex. Estimate the population mean weight using the sample mean weight

• Hypothesis testing

– ex. Test the claim that the population mean weight is 120 pounds

Inferential Statistics..

• Especially relates to:

– Determining whether characteristics of a situation are unusual or if they have happened by chance

– Estimating values of numerical quantities and determining the reliability of those estimates

– Using past occurrences to attempt to predict the future

Process of Inferential Statistics

1. Population (parameter µ)

2. Select a random sample

3. Sample (statistic x̄)

4. Calculate x̄ to estimate µ
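The process on this slide can be imitated with a small simulation. The sketch below assumes a made-up population of 10,000 weights generated with Python's standard library; the population size, the mean of 120, the spread of 15 and the sample size of 50 are all hypothetical choices for illustration, not values from the course.

```python
import random
import statistics

random.seed(2012)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 simulated weights in pounds (not real data).
population = [random.gauss(120, 15) for _ in range(10_000)]
mu = statistics.mean(population)  # parameter (unknown in practice)

# Select a simple random sample without replacement.
sample = random.sample(population, 50)
x_bar = statistics.mean(sample)  # statistic computed from the sample

print(f"mu = {mu:.2f}, x_bar = {x_bar:.2f}")
```

In practice only the sample is observed, so x̄ is used as the estimate of µ; the simulation computes µ only to show how close the estimate comes.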

Population vs. Sample

• Population: measures used to describe the population are called parameters

• Sample: measures computed from sample data are called statistics

Parameter vs. Statistic

• Parameter: descriptive measure of the population

– Usually represented by Greek letters

• Statistic: descriptive measure of a sample

– Usually represented by Roman letters

Symbols for Population Parameters

µ denotes population mean

σ² denotes population variance

σ denotes population standard deviation

Symbols for Sample Statistics

x̄ denotes sample mean

S² denotes sample variance

S denotes sample standard deviation
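A short sketch can make the difference between the population symbols and the sample symbols concrete. It uses Python's standard `statistics` module on a hypothetical list of eight values. The divisor n - 1 for the sample variance is the usual convention (and what `statistics.variance` uses); the slides themselves do not state the formulas.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, for illustration only

# Treating the data as an entire population: squared deviations divided by N.
mu = statistics.mean(data)
sigma_sq = statistics.pvariance(data)
sigma = statistics.pstdev(data)

# Treating the same data as a sample: squared deviations divided by n - 1.
x_bar = statistics.mean(data)
s_sq = statistics.variance(data)
s = statistics.stdev(data)

print(f"population: mu={mu}, variance={sigma_sq}, std dev={sigma}")
print(f"sample:     x_bar={x_bar}, variance={s_sq:.3f}, std dev={s:.3f}")
```

The two means agree, but the sample variance is larger than the population variance because of the smaller divisor.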

Types of Variables

• Categorical (qualitative) variables have values that can only be placed into categories, such as “yes” and “no.”

• Numerical (quantitative) variables have values that represent quantities.

Types of Variables

Data

• Categorical (defined categories)

– Examples: Marital Status, Political Party, Eye Color

• Numerical

– Discrete (counted items). Examples: Number of Children, Defects per hour

– Continuous (measured characteristics). Examples: Weight, Voltage

Levels of Data Measurement

• Nominal — Lowest level of measurement

• Ordinal

• Interval

• Ratio — Highest level of measurement

Levels of Measurement

• A nominal scale classifies data into distinct categories in which no ranking is implied.

Categorical Variable           Categories
Personal Computer Ownership    Yes / No
Type of Stocks Owned           Growth / Value / Other
Internet Provider              Microsoft Network / AOL

Levels of Measurement

• An ordinal scale classifies data into distinct categories in which ranking is implied.

Categorical Variable              Ordered Categories
Student class designation         Freshman, Sophomore, Junior, Senior
Product satisfaction              Satisfied, Neutral, Unsatisfied
Faculty rank                      Professor, Associate Professor, Assistant Professor, Instructor
Standard & Poor’s bond ratings    AAA, AA, A, BBB, BB, B, CCC, CC, C, DDD, DD, D
Student Grades                    A, B, C, D, F

Levels of Measurement

• An interval scale is an ordered scale in which the difference between measurements is a meaningful quantity but the measurements do not have a true zero point.

• A ratio scale is an ordered scale in which the difference between the measurements is a meaningful quantity and the measurements have a true zero point.

Interval and Ratio Scales

Usage Potential of Various Levels of Data

From lowest to highest usage potential: Nominal, Ordinal, Interval, Ratio

Data Level, Operations, and Statistical Methods

Data Level    Meaningful Operations                                Statistical Methods
Nominal       Classifying and counting                             Nonparametric
Ordinal       All of the above plus ranking                        Nonparametric
Interval      All of the above plus addition, subtraction          Parametric
Ratio         All of the above plus multiplication and division    Parametric

Data preparation rules

• Data presented must be

– factual

– relevant

Before presentation always check:

• the source of the data

• that the data has been accurately transcribed

• that the figures are relevant to the problem

Methods of visual presentation of data

• Table

         1st Qtr    2nd Qtr    3rd Qtr    4th Qtr
East     20.4       27.4       90         20.4
West     30.6       38.6       34.6       31.6
North    45.9       46.9       45         43.9

Methods of visual presentation of data

• Graphs

[Figure: chart of the East, West and North values by quarter (1st Qtr to 4th Qtr); vertical axis 0 to 90]

Methods of visual presentation of data

• Pie chart

[Figure: pie chart with one slice per quarter (1st Qtr to 4th Qtr)]

Methods of visual presentation of data

• Multiple bar chart

[Figure: multiple bar chart of North, West and East for each quarter (1st Qtr to 4th Qtr); value axis 0 to 100]

Methods of visual presentation of data

• Simple pictogram

[Figure: pictogram of East, North and West by quarter (1st Qtr to 4th Qtr); vertical axis 0 to 100]

Frequency distributions

• Frequency tables

Observation Table

Class Interval    Frequency    Cumulative Frequency
< 20              13           13
< 40              18           31
< 60              25           56
< 80              15           71
< 100             9            80
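The cumulative frequency column of the observation table is a running total of the frequency column. A minimal sketch using the table's own numbers and the standard library's `itertools.accumulate`:

```python
from itertools import accumulate

class_limits = ["< 20", "< 40", "< 60", "< 80", "< 100"]
frequencies = [13, 18, 25, 15, 9]

# Each cumulative value is the sum of all frequencies up to and including that class.
cumulative = list(accumulate(frequencies))

for limit, f, cf in zip(class_limits, frequencies, cumulative):
    print(f"{limit:<6} {f:>3} {cf:>3}")
```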

Frequency diagrams

[Figure: chart of Frequency by class interval (< 20 to < 100); vertical axis 0 to 30]

[Figure: chart of Cumulative Frequency by class interval (< 20 to < 100); vertical axis 0 to 90]

Ungrouped Versus Grouped Data

• Ungrouped data

– have not been summarized in any way

– are also called raw data

• Grouped data

– have been organized into a frequency distribution

Example of Ungrouped Data

Ages of a Sample of Managers from XYZ

42 30 53 50 52 30 55 49 61 74
26 58 40 40 28 36 30 33 31 37
32 37 30 32 23 32 58 43 30 29
34 50 47 31 35 26 64 46 40 43
57 30 49 40 25 50 52 32 60 54

Frequency Distribution of Ages

Class Interval    Frequency
20-under 30       6
30-under 40       18
40-under 50       11
50-under 60       11
60-under 70       3
70-under 80       1
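The frequency distribution of ages can be rebuilt from the 50 raw ages by counting how many values fall in each class. The sketch below is one way to do it; `frequency_distribution` is a hypothetical helper name, and the starting point of 20 and width of 10 are read off the class intervals in the table.

```python
from collections import Counter

# Ages of the sample of managers, as listed in the ungrouped data.
ages = [
    42, 30, 53, 50, 52, 30, 55, 49, 61, 74,
    26, 58, 40, 40, 28, 36, 30, 33, 31, 37,
    32, 37, 30, 32, 23, 32, 58, 43, 30, 29,
    34, 50, 47, 31, 35, 26, 64, 46, 40, 43,
    57, 30, 49, 40, 25, 50, 52, 32, 60, 54,
]


def frequency_distribution(values, start, width, n_classes):
    """Count values in classes [start, start + width), [start + width, ...), and so on."""
    counts = Counter((v - start) // width for v in values)
    return [
        (start + k * width, start + (k + 1) * width, counts.get(k, 0))
        for k in range(n_classes)
    ]


table = frequency_distribution(ages, start=20, width=10, n_classes=6)
for lower, upper, freq in table:
    print(f"{lower}-under {upper}: {freq}")
```

Each class includes its lower endpoint and excludes its upper endpoint, which is what "30-under 40" means: an age of 40 is counted in the 40-under 50 class.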

Data Range

For the ages of the sample of managers from XYZ:

Smallest = 23

Largest = 74

Range = Largest - Smallest
= 74 - 23
= 51

Number of Classes and Class Width

• The number of classes should be between 5 and 15.

– Fewer than 5 classes cause excessive summarization.

– More than 15 classes leave too much detail.

• Class Width

– Divide the range by the number of classes for an approximate class width

– Round up to a convenient number

Approximate Class Width = 51 / 6 = 8.5

Class Width = 10
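The class-width arithmetic can be written out as a small function. The slide leaves "a convenient number" to judgment; the sketch assumes it means the next multiple of 5, which is only one possible reading, and `class_width` and its `step` parameter are hypothetical names introduced for illustration.

```python
import math


def class_width(smallest, largest, n_classes, step=5):
    """Return (range, approximate width, rounded-up width)."""
    data_range = largest - smallest
    approximate = data_range / n_classes
    # Assumption: a "convenient number" is the next multiple of `step`.
    width = math.ceil(approximate / step) * step
    return data_range, approximate, width


print(class_width(23, 74, 6))  # (51, 8.5, 10)
```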

Class Midpoint

Class Midpoint = (beginning class endpoint + ending class endpoint) / 2
= (30 + 40) / 2
= 35

Class Midpoint = class beginning point + (1/2)(class width)
= 30 + (1/2)(10)
= 35
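Applying the second midpoint formula to every class gives the full midpoint column. A minimal sketch, assuming classes that start at 20 with a width of 10 as in the age example; `class_midpoints` is a hypothetical helper name.

```python
def class_midpoints(start, width, n_classes):
    """Midpoint of each class: class beginning point + (1/2)(class width)."""
    return [start + k * width + width / 2 for k in range(n_classes)]


midpoints = class_midpoints(start=20, width=10, n_classes=6)
print(midpoints)  # [25.0, 35.0, 45.0, 55.0, 65.0, 75.0]
```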

Relative Frequency

Class Interval    Frequency    Relative Frequency
20-under 30       6            .12    (= 6 / 50)
30-under 40       18           .36    (= 18 / 50)
40-under 50       11           .22
50-under 60       11           .22
60-under 70       3            .06
70-under 80       1            .02
Total             50           1.00
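Each relative frequency is the class frequency divided by the total number of observations. A short sketch using the frequencies from the table (the variable names are illustrative):

```python
frequencies = {
    "20-under 30": 6,
    "30-under 40": 18,
    "40-under 50": 11,
    "50-under 60": 11,
    "60-under 70": 3,
    "70-under 80": 1,
}

total = sum(frequencies.values())

# Relative frequency = class frequency / total frequency.
relative = {label: f / total for label, f in frequencies.items()}

for label, rf in relative.items():
    print(f"{label}: {rf:.2f}")
```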

Cumulative Frequency

Class Interval    Frequency    Cumulative Frequency
20-under 30       6            6
30-under 40       18           24    (= 18 + 6)
40-under 50       11           35    (= 11 + 24)
50-under 60       11           46
60-under 70       3            49
70-under 80       1            50
Total             50

Class Midpoints, Relative Frequencies, and Cumulative Frequencies

Class Interval    Frequency    Midpoint    Relative Frequency    Cumulative Frequency
20-under 30       6            25          .12                   6
30-under 40       18           35          .36                   24
40-under 50       11           45          .22                   35
50-under 60       11           55          .22                   46
60-under 70       3            65          .06                   49
70-under 80       1            75          .02                   50
Total             50                       1.00

Cumulative Relative Frequencies

Class Interval    Frequency    Relative Frequency    Cumulative Frequency    Cumulative Relative Frequency
20-under 30       6            .12                   6                       .12
30-under 40       18           .36                   24                      .48
40-under 50       11           .22                   35                      .70
50-under 60       11           .22                   46                      .92
60-under 70       3            .06                   49                      .98
70-under 80       1            .02                   50                      1.00
Total             50           1.00
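The cumulative and cumulative relative columns follow directly from the frequency column. A minimal sketch with the age frequencies, again using `itertools.accumulate` for the running total:

```python
from itertools import accumulate

frequencies = [6, 18, 11, 11, 3, 1]
total = sum(frequencies)

# Running total of the frequencies.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency = cumulative frequency / total frequency.
cumulative_relative = [round(cf / total, 2) for cf in cumulative]

print(cumulative)
print(cumulative_relative)
```

These are the values plotted on the vertical axis of an ogive (cumulative frequency) and a relative frequency ogive (cumulative relative frequency).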

Common Statistical Graphs

• Histogram -- vertical bar chart of frequencies

• Frequency Polygon -- line graph of frequencies

• Ogive -- line graph of cumulative frequencies

• Pie Chart -- proportional representation for categories of a whole

• Stem and Leaf Plot

• Pareto Chart

• Scatter Plot

Histogram

[Figure: histogram of the frequency distribution of ages (class intervals 20-under 30 through 70-under 80); horizontal axis: Years, 0 to 80; vertical axis: Frequency, 0 to 20]

Histogram Construction

Class Interval    Frequency
20-under 30       6
30-under 40       18
40-under 50       11
50-under 60       11
60-under 70       3
70-under 80       1

[Figure: histogram under construction from the table above; horizontal axis: Years, 0 to 80; vertical axis: Frequency, 0 to 20]
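A histogram draws one bar per class with a length proportional to the class frequency. The sketch below prints a rough text version, with bars running sideways rather than vertically; it is an illustration of the idea, not the charting method used in the course (which relies on Excel), and `text_histogram` is a hypothetical helper name.

```python
classes = [
    ("20-under 30", 6),
    ("30-under 40", 18),
    ("40-under 50", 11),
    ("50-under 60", 11),
    ("60-under 70", 3),
    ("70-under 80", 1),
]


def text_histogram(rows, symbol="#"):
    """Return one line per class; the bar length equals the class frequency."""
    label_width = max(len(label) for label, _ in rows)
    return [
        f"{label:<{label_width}} | {symbol * freq} ({freq})"
        for label, freq in rows
    ]


lines = text_histogram(classes)
print("\n".join(lines))
```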

Frequency Polygon

[Figure: frequency polygon of the frequency distribution of ages (class intervals 20-under 30 through 70-under 80); horizontal axis: Years, 0 to 80; vertical axis: Frequency, 0 to 20]

Ogive

Class Interval    Cumulative Frequency
20-under 30       6
30-under 40       24
40-under 50       35
50-under 60       46
60-under 70       49
70-under 80       50

[Figure: ogive of the cumulative frequencies above; horizontal axis: Years, 0 to 80; vertical axis: Frequency, 0 to 60]

Relative Frequency Ogive

Class Interval    Cumulative Relative Frequency
20-under 30       .12
30-under 40       .48
40-under 50       .70
50-under 60       .92
60-under 70       .98
70-under 80       1.00

[Figure: ogive of the cumulative relative frequencies above; horizontal axis: Years, 0 to 80; vertical axis: Cumulative Relative Frequency, 0.00 to 1.00]

Complaints by Passengers

COMPLAINT            NUMBER    PROPORTION    DEGREES
Stations, etc.       28,000    .40           144.0
Train Performance    14,700    .21           75.6
Equipment            10,500    .15           54.0
Personnel            9,800     .14           50.4
Schedules, etc.      7,000     .10           36.0
Total                70,000    1.00          360.0

Complaints by Passengers

[Figure: pie chart of complaints: Stations, etc. 40%; Train Performance 21%; Equipment 15%; Personnel 14%; Schedules, etc. 10%]

Second Quarter Truck Production

Company    2d Quarter Truck Production
A          357,411
B          354,936
C          160,997
D          34,099
E          12,747
Totals     920,190

Second Quarter Truck Production

[Figure: pie chart of second quarter truck production by company: A 39%, B 39%, C 17%, D 4%, E 1%]

Pie Chart Calculations for Company A

Company    2d Quarter Truck Production    Proportion    Degrees
A          357,411                        .388          140
B          354,936                        .386          139
C          160,997                        .175          63
D          34,099                         .037          13
E          12,747                         .014          5
Totals     920,190                        1.000         360

Proportion for Company A = 357,411 / 920,190 = .388

Degrees for Company A = .388 × 360 = 140
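The same two steps, proportion and then degrees, can be repeated for every company. The sketch below uses the production figures from the table. One small difference from the slide: it multiplies the unrounded proportion by 360 before rounding, whereas the slide multiplies the rounded .388; for these data both routes give the same whole-degree values.

```python
production = {
    "A": 357_411,
    "B": 354_936,
    "C": 160_997,
    "D": 34_099,
    "E": 12_747,
}

total = sum(production.values())

# Proportion of the whole, rounded to three decimals as in the table.
proportions = {c: round(units / total, 3) for c, units in production.items()}

# Slice angle in degrees: proportion of the whole times 360.
degrees = {c: round(units / total * 360) for c, units in production.items()}

for company in production:
    print(company, production[company], proportions[company], degrees[company])
```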

Pareto Chart

[Figure: Pareto chart with categories Poor Wiring, Short in Coil, Defective Plug and Other; left vertical axis: Frequency, 0 to 100; right vertical axis: percentage, 0% to 100%]

Scatter Plot

Registered Vehicles (1000's)    Gasoline Sales (1000's of Gallons)
5                               60
15                              120
9                               90
15                              140
7                               60

[Figure: scatter plot of the data above; horizontal axis: Registered Vehicles, 0 to 20; vertical axis: Gasoline Sales, 0 to 200]

Principles of Excellent Graphs

• The graph should not distort the data.

• The graph should not contain unnecessary adornments (sometimes referred to as chart junk).

• The scale on the vertical axis should begin at zero.

• All axes should be properly labeled.

• The graph should contain a title.

• The simplest possible graph should be used for a given set of data.

Graphical Errors: Chart Junk

Bad Presentation

[Figure: decorated "Minimum Wage" graphic labeled 1960: $1.00, 1970: $1.60, 1980: $3.10, 1990: $3.80]

Good Presentation

[Figure: plain "Minimum Wage" chart for 1960, 1970, 1980 and 1990; vertical axis: $, 0 to 4]

Graphical Errors: Compressing the Vertical Axis

Bad Presentation

[Figure: "Quarterly Sales" chart for Q1 to Q4; vertical axis: $, 0 to 200]

Good Presentation

[Figure: "Quarterly Sales" chart for Q1 to Q4; vertical axis: $, 0 to 50]

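Graphical Errors: No Zero Point on the Vertical Axis

Graphing the first six months of sales

Bad Presentation

[Figure: "Monthly Sales" chart for J, F, M, A, M, J; vertical axis: $, 36 to 45, with no zero point]

Good Presentations

[Figure: "Monthly Sales" chart for J, F, M, A, M, J; vertical axis: $, starting at 0 and then marked 36 to 45]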

Thank You

• http://www.stats.gla.ac.uk/steps/glossary/presenting_data.html

• http://www.ilir.uiuc.edu/courses/lir593/
