06 making sense of data

Research Methodology
Dr. Nimit Chowdhary, Professor
Research Methodology Workshop, Saturday, April 14, 2012
© Dr. Nimit Chowdhary

The ‘population’ in statistics includes all members of a defined group that we are studying or collecting information on for data-driven decisions.

A part of the population is called a sample. It is a portion of the population, a slice of it, chosen to reflect the characteristics of the whole.



In industry, business, and government, the mass of data collected is often voluminous. Even a single item, such as the number of daily billing errors at a large restaurant, can produce so much data that it is more confusing than helpful.


0 1 3 0 1 0 1 0
1 5 4 1 2 1 2 0
1 0 2 0 0 2 0 1
2 1 1 1 2 1 1
0 4 1 3 1 1 1
1 3 4 0 0 0 0
1 3 0 1 2 2 3

Do you understand anything? Does this make any sense?


No. of Nonconforming   Tally Marks                Frequency
0                      IIIII IIIII IIIII             15
1                      IIIII IIIII IIIII IIIII       20
2                      IIIII III                      8
3                      IIIII                          5
4                      III                            3
5                      I                              1
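The tally can also be done mechanically. The following is a minimal Python sketch, not part of the original slides: it assumes the 52 daily counts from the raw listing are held in a list called `errors` (an illustrative name) and counts them with the standard-library `Counter`.

```python
from collections import Counter

# The 52 daily counts of nonconforming items from the raw listing.
errors = [
    0, 1, 3, 0, 1, 0, 1, 0,
    1, 5, 4, 1, 2, 1, 2, 0,
    1, 0, 2, 0, 0, 2, 0, 1,
    2, 1, 1, 1, 2, 1, 1,
    0, 4, 1, 3, 1, 1, 1,
    1, 3, 4, 0, 0, 0, 0,
    1, 3, 0, 1, 2, 2, 3,
]

frequency = Counter(errors)  # maps each value to how often it occurs

print("No. nonconforming  Frequency")
for value in sorted(frequency):
    print(f"{value:>17}  {frequency[value]:>9}")
print(f"{'Total':>17}  {sum(frequency.values()):>9}")
```

Counting the raw values this way gives the same six frequencies as the hand tally.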

[Figure: chart of frequency (vertical axis 0 to 25) for the six values of number nonconforming]


No. of Nonconforming   Frequency   Relative Frequency
0                         15       15/52 = 0.29
1                         20       20/52 = 0.38
2                          8        8/52 = 0.15
3                          5        5/52 = 0.10
4                          3        3/52 = 0.06
5                          1        1/52 = 0.02
Total                     52               1.00
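The relative frequency column is each frequency divided by the total. A minimal Python sketch of that arithmetic follows; the dictionary name `frequency` is illustrative and the values are copied from the table above.

```python
# Frequencies taken from the table above (value -> count).
frequency = {0: 15, 1: 20, 2: 8, 3: 5, 4: 3, 5: 1}

total = sum(frequency.values())  # 52 observations

# Relative frequency = class frequency / total number of observations.
relative = {value: count / total for value, count in frequency.items()}

for value, share in relative.items():
    print(f"{value}  {frequency[value]:>2}/{total} = {share:.2f}")
```

Before rounding, the relative frequencies always sum to 1.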

[Figure: chart of relative frequency (vertical axis 0.00 to 0.40) for the six values of number nonconforming]


No. of Nonconforming   Frequency   Cumulative Frequency   Relative Cumulative Frequency
0                         15                 15           15/52 = 0.29
1                         20       15 + 20 = 35           35/52 = 0.67
2                          8       35 +  8 = 43           43/52 = 0.83
3                          5       43 +  5 = 48           48/52 = 0.92
4                          3       48 +  3 = 51           51/52 = 0.98
5                          1       51 +  1 = 52           52/52 = 1.00
Total                     52
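The cumulative columns are running totals of the frequencies. A short Python sketch, using `itertools.accumulate` from the standard library, shows the calculation; the list names are illustrative.

```python
from itertools import accumulate

values = [0, 1, 2, 3, 4, 5]          # number nonconforming
frequency = [15, 20, 8, 5, 3, 1]     # frequencies from the table above

total = sum(frequency)
cumulative = list(accumulate(frequency))               # running totals
relative_cumulative = [c / total for c in cumulative]  # share of all observations

for v, f, c, rc in zip(values, frequency, cumulative, relative_cumulative):
    print(f"{v}  {f:>2}  {c:>2}  {c}/{total} = {rc:.2f}")
```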

[Figure: chart of cumulative frequency (vertical axis 0 to 60) for the six values of number nonconforming]

[Figure: chart of relative cumulative frequency (vertical axis 0.00 to 1.00) for the six values of number nonconforming]


Statistics: the science that deals with the collection, tabulation, analysis, interpretation, and presentation of quantitative data.


Descriptive or deductive statistics: endeavors to describe and analyze a subject or group.

Inductive statistics: endeavors to determine, from a limited amount of data (a sample), important conclusions about a much larger amount of data (the population).


Who is a better batsman?


Batter        Scores in 5 consecutive innings
Sachin T        0   25   50   75  100
Rahul D        40   45   50   55   60
Virendra S      0   20   30   30  120
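The slide poses the question without prescribing a method. One way to approach it is to compare a measure of location with a measure of spread for each batter. The sketch below is an assumption about how that comparison might be done, using Python's standard `statistics` module; the choice of mean, median, range, and population standard deviation is illustrative.

```python
from statistics import mean, median, pstdev

scores = {
    "Sachin T": [0, 25, 50, 75, 100],
    "Rahul D": [40, 45, 50, 55, 60],
    "Virendra S": [0, 20, 30, 30, 120],
}

summary = {}
for batter, runs in scores.items():
    summary[batter] = {
        "mean": mean(runs),
        "median": median(runs),
        "range": max(runs) - min(runs),
        "std_dev": pstdev(runs),  # population standard deviation of the five scores
    }

for batter, stats in summary.items():
    print(f"{batter:<11} mean={stats['mean']:.1f}  median={stats['median']}  "
          f"range={stats['range']}  std_dev={stats['std_dev']:.2f}")
```

Sachin T and Rahul D have the same mean of 50, but Rahul D's scores span a range of 20 against Sachin T's 100, so the average alone does not settle the question.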

Position/location: measure of central tendency
Variation: measure of dispersion
Symmetry: skewness
Peakedness: kurtosis


Descriptive Statistics

Measure of location
  Mean
  Median
  Mode

Measure of dispersion
  Standard deviation
  Variance
  Range
  Mean deviation

Graphic displays
  Frequency distribution table
  Frequency distribution polygon
  Histogram
  Bar diagram


Measures of central tendency
  Mean
    Arithmetic mean
    Geometric mean
    Harmonic mean
  Median
  Mode
  Trimmed mean

Measures of dispersion
  Range
  Inter-quartile range
  Mean deviation
  Standard deviation
  Variance
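Most of these measures are available in Python's standard `statistics` module (Python 3.8 or later). The sketch below is illustrative only: it reuses Rahul D's five scores as sample data, treats them as a whole population (`pstdev`, `pvariance`), and defines a hypothetical helper `trimmed_mean`, since the module has no trimmed mean of its own.

```python
from statistics import (
    geometric_mean, harmonic_mean, mean, median, multimode,
    pstdev, pvariance, quantiles,
)

data = [40, 45, 50, 55, 60]  # Rahul D's five scores, reused as sample data


def trimmed_mean(values, trim=1):
    """Mean after dropping the `trim` smallest and `trim` largest values."""
    ordered = sorted(values)
    return mean(ordered[trim:len(ordered) - trim])


# Measures of central tendency
arithmetic = mean(data)
geometric = geometric_mean(data)
harmonic = harmonic_mean(data)
middle = median(data)
modes = multimode(data)  # every value that shares the highest count
trimmed = trimmed_mean(data)

# Measures of dispersion
data_range = max(data) - min(data)
q1, _, q3 = quantiles(data, n=4)  # quartile cut points (default "exclusive" method)
iqr = q3 - q1
mean_deviation = mean([abs(x - arithmetic) for x in data])
std_dev = pstdev(data)
variance = pvariance(data)

print(arithmetic, geometric, harmonic, middle, modes, trimmed)
print(data_range, iqr, mean_deviation, std_dev, variance)
```

For positive data the harmonic mean never exceeds the geometric mean, which in turn never exceeds the arithmetic mean.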


Following are the scores of 110 students in Statistics:

59 56 66 46 61 44 50 52 36 70
70 46 65 43 38 64 53 58 38 64
60 60 45 51 68 52 43 62 71 53
46 55 51 54 74 39 69 52 36 37
68 72 50 56 51 32 52 75 45 51
61 60 64 67 60 47 37 47 33 38
51 62 42 49 61 71 45 45 56 43
56 50 61 58 56 51 69 59 34 61
59 57 32 75 51 67 72 58 42 74
50 59 65 52 60 70 42 52 51 53
34 47 69 59 49 46 31 63 54 44

Score  Tabulation   Score  Tabulation   Score  Tabulation   Score  Tabulation   Score  Tabulation
31     I            41                  51     IIIII III    61     IIIII        71     II
32     II           42     III          52     IIIII I      62     II           72     II
33     I            43     III          53     III          63     I            73
34     II           44     II           54     II           64     III          74     II
35                  45     IIII         55     I            65     II           75     II
36     II           46     IIII         56     IIIII        66     I
37     II           47     III          57     I            67     II
38     III          48                  58     III          68     II
39     I            49     II           59     IIIII        69     III
40                  50     IIII         60     IIIII        70     III

Does this make any sense?


Class         Criterion      Frequency
Fail          Below 40           14
Pass          40-49              21
II Division   50-59              38
I Division    60-69              26
Honours       70 plus            11
Total                           110
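The class frequencies can be checked against the raw scores. The Python sketch below is illustrative: it assumes the Fail class covers every score below 40 (the Pass class starts at 40), and the function name `division` is hypothetical.

```python
from collections import Counter

# The 110 statistics scores listed earlier.
raw = """
59 56 66 46 61 44 50 52 36 70
70 46 65 43 38 64 53 58 38 64
60 60 45 51 68 52 43 62 71 53
46 55 51 54 74 39 69 52 36 37
68 72 50 56 51 32 52 75 45 51
61 60 64 67 60 47 37 47 33 38
51 62 42 49 61 71 45 45 56 43
56 50 61 58 56 51 69 59 34 61
59 57 32 75 51 67 72 58 42 74
50 59 65 52 60 70 42 52 51 53
34 47 69 59 49 46 31 63 54 44
"""
scores = [int(token) for token in raw.split()]


def division(score):
    """Map a score to its class using the cut-offs in the table."""
    if score < 40:
        return "Fail"
    if score < 50:
        return "Pass"
    if score < 60:
        return "II Division"
    if score < 70:
        return "I Division"
    return "Honours"


counts = Counter(division(s) for s in scores)

for name in ["Fail", "Pass", "II Division", "I Division", "Honours"]:
    print(f"{name:<12} {counts[name]:>3}")
print(f"{'Total':<12} {sum(counts.values()):>3}")
```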

Determine the range

Determine the cell interval

Determine the number of classes

Determine the cell midpoints

Determine the cell boundaries

Post the cell frequencies


$R = X_h - X_l$

where
R = Range
$X_h$ = Largest number
$X_l$ = Smallest number

$R = X_h - X_l = 75 - 31 = 44$

Interval (i)
One way is to use Sturges' Rule:

$i = \frac{R}{1 + 3.322 \log n}$

In our case n = 110 and R = 44:

$i = \frac{R}{1 + 3.322 \log n} = \frac{44}{1 + 3.322(2.041)} = 5.7$

$i \approx 6$
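The interval calculation can be verified numerically. This is a minimal Python sketch of the formula as used here, with the logarithm taken to base 10; the variable names are illustrative.

```python
import math

n = 110        # number of observations
R = 75 - 31    # range = largest score - smallest score = 44

# Sturges' rule for the cell interval: i = R / (1 + 3.322 * log10(n))
i_exact = R / (1 + 3.322 * math.log10(n))
i = round(i_exact)  # rounded to a convenient whole number

print(f"i = {i_exact:.1f}, rounded to {i}")
```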


$k = \frac{R}{i}$

In our case i = 5.7 ≈ 6, so

$k = \frac{R}{i} = \frac{44}{6} = 7.33 \approx 8$

Remember, $i \propto \frac{1}{k}$

Let $R' = i \times k = 6 \times 8 = 48$

Now, $\frac{R' - R}{2} = \frac{48 - 44}{2} = 2$

Lower limit of first class is: $X_l - 2 = 31 - 2 = 29$

And, upper limit of last class is: $X_h + 2 = 75 + 2 = 77$

Starting from 29, we can create classes by adding 6 (= i) to each lower limit: 29-35, 36-42, and so on up to 71-77.
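The steps from range to class limits can be followed in code. The Python sketch below is illustrative and rests on two assumptions: the number of classes is rounded up, and because the scores are whole numbers, each class runs from its lower limit to that limit plus i, with the next class starting one unit higher.

```python
import math

x_low, x_high = 31, 75     # smallest and largest scores
i = 6                      # cell interval from Sturges' rule, rounded

R = x_high - x_low         # range
k = math.ceil(R / i)       # number of classes, rounded up
R_adj = i * k              # adjusted range R'
pad = (R_adj - R) // 2     # units added at each end of the range

lower_limit = x_low - pad    # lower limit of the first class
upper_limit = x_high + pad   # upper limit of the last class

# Each cell spans lower..lower + i; the next cell starts one unit higher.
cells = []
lower = lower_limit
while lower <= x_high:
    cells.append((lower, lower + i))
    lower += i + 1

print(k, R_adj, lower_limit, upper_limit)
print(cells)
```

Built this way, the cells run from 29-35 to 71-77, which gives seven cells for these whole-number scores.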


Cell Boundaries   Cell Midpoint   Frequency
29 - 35                32              6
36 - 42                39             11
43 - 49                46             18
50 - 56                53             29
57 - 63                60             22
64 - 70                67             16
71 - 77                74              8
Total                                110
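Posting the cell frequencies means counting how many scores fall inside each cell. The Python sketch below is illustrative: it takes the seven cells from the table and counts the 110 raw scores against them, computing each midpoint as the average of the two limits.

```python
# The 110 statistics scores listed earlier.
raw = """
59 56 66 46 61 44 50 52 36 70
70 46 65 43 38 64 53 58 38 64
60 60 45 51 68 52 43 62 71 53
46 55 51 54 74 39 69 52 36 37
68 72 50 56 51 32 52 75 45 51
61 60 64 67 60 47 37 47 33 38
51 62 42 49 61 71 45 45 56 43
56 50 61 58 56 51 69 59 34 61
59 57 32 75 51 67 72 58 42 74
50 59 65 52 60 70 42 52 51 53
34 47 69 59 49 46 31 63 54 44
"""
scores = [int(token) for token in raw.split()]

cells = [(29, 35), (36, 42), (43, 49), (50, 56), (57, 63), (64, 70), (71, 77)]

table = []
for lower, upper in cells:
    midpoint = (lower + upper) // 2                       # centre of the cell
    frequency = sum(lower <= s <= upper for s in scores)  # scores inside the cell
    table.append((lower, upper, midpoint, frequency))

for lower, upper, midpoint, frequency in table:
    print(f"{lower} - {upper}   {midpoint}   {frequency}")
print("Total", sum(row[3] for row in table))
```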

[Figure: chart of the cell frequencies (6, 11, 18, 29, 22, 16, 8) against the cell midpoints (32, 39, 46, 53, 60, 67, 74), vertical axis 0 to 35]

[Figure: chart against the cell midpoints (32 to 74), vertical axis 0 to 35]

[Figure: chart against the cell midpoints (32 to 74), vertical axis 0 to 120]


Making sense of a phenomenon: population or sample

Four dimensions of a phenomenon:
  Central tendency
  Dispersion
  Symmetry
  Peakedness

Grouped vs. ungrouped data
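All four dimensions can be estimated from grouped data alone by treating every score in a cell as if it sat at the cell midpoint. The Python sketch below is one illustrative way to do this for the 110 scores; it assumes the moment-based definitions of skewness and kurtosis, with each moment divided by n, which is a choice made for this example and not something the slides specify.

```python
import math

# Grouped data: cell midpoints and their frequencies.
midpoints = [32, 39, 46, 53, 60, 67, 74]
frequencies = [6, 11, 18, 29, 22, 16, 8]

n = sum(frequencies)
mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n


def central_moment(order):
    """Frequency-weighted central moment of the given order."""
    return sum(f * (x - mean) ** order for f, x in zip(frequencies, midpoints)) / n


variance = central_moment(2)
std_dev = math.sqrt(variance)                # dispersion
skewness = central_moment(3) / std_dev ** 3  # symmetry: 0 for a symmetric distribution
kurtosis = central_moment(4) / std_dev ** 4  # peakedness: about 3 for a normal curve

print(f"mean={mean:.2f}  std_dev={std_dev:.2f}  "
      f"skewness={skewness:.2f}  kurtosis={kurtosis:.2f}")
```

Because the midpoints stand in for the individual scores, results from grouped data are approximations of the values that the ungrouped scores would give.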
