06 Making sense of data
14-04-2012
Research Methodology Dr. Nimit Chowdhary, Professor
The ‘population’ in statistics includes all members of a defined group that we are studying or collecting information on for data-driven decisions.
A part of the population is called a sample. It is a portion of the population, a slice of it, that is expected to carry the population's characteristics.
Saturday, April 14, 2012 © Dr. Nimit Chowdhary Research Methodology Workshop p. 2
In industry, business, and government, the mass of data that has been collected is often voluminous. Even one item, such as the number of daily billing errors of a large restaurant, can represent such a mass of data that it is more confusing than helpful.
0 1 3 0 1 0 1 0
1 5 4 1 2 1 2 0
1 0 2 0 0 2 0 1
2 1 1 1 2 1 1
0 4 1 3 1 1 1
1 3 4 0 0 0 0
1 3 0 1 2 2 3
Do you understand anything? Does this make any sense?
No. of Nonconforming   Tally Marks                Frequency
0                      IIII/ IIII/ IIII/          15
1                      IIII/ IIII/ IIII/ IIII/    20
2                      IIII/ III                  8
3                      IIII/                      5
4                      III                        3
5                      I                          1
(IIII/ denotes a bundle of five tally marks.)
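The tally above can be reproduced mechanically. The following is a minimal sketch, assuming the 52 raw counts are read row by row from the earlier slide; the function name `frequency_table` is illustrative and not part of the workshop material.

```python
from collections import Counter

# Daily counts of nonconforming items, as listed on the raw-data slide (52 values).
errors = [
    0, 1, 3, 0, 1, 0, 1, 0,
    1, 5, 4, 1, 2, 1, 2, 0,
    1, 0, 2, 0, 0, 2, 0, 1,
    2, 1, 1, 1, 2, 1, 1,
    0, 4, 1, 3, 1, 1, 1,
    1, 3, 4, 0, 0, 0, 0,
    1, 3, 0, 1, 2, 2, 3,
]


def frequency_table(values):
    """Return {value: count}, ordered by value (hypothetical helper)."""
    counts = Counter(values)
    return {v: counts[v] for v in sorted(counts)}


freq = frequency_table(errors)
for value, count in freq.items():
    # One bar per observation, followed by the count.
    print(f"{value}: {'|' * count} ({count})")
```

Counting with `Counter` replaces the manual tally; sorting the keys gives the same row order as the table.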
[Chart: frequency of each number nonconforming (0 to 5); vertical axis: frequency, 0 to 25]
No. of Nonconforming   Frequency   Relative Frequency
0                      15          15/52 = 0.29
1                      20          20/52 = 0.38
2                      8           8/52 = 0.15
3                      5           5/52 = 0.10
4                      3           3/52 = 0.06
5                      1           1/52 = 0.02
Total                  52          1.00
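A relative frequency is simply each count divided by the total. A minimal sketch, starting from the frequencies in the table above (the helper name `relative_frequencies` is an assumption for illustration):

```python
# Frequencies of "number nonconforming", taken from the table above.
freq = {0: 15, 1: 20, 2: 8, 3: 5, 4: 3, 5: 1}


def relative_frequencies(freq):
    """Divide each count by the total number of observations."""
    total = sum(freq.values())
    return {value: count / total for value, count in freq.items()}


rel = relative_frequencies(freq)
total = sum(freq.values())
for value, share in rel.items():
    print(f"{value}: {freq[value]}/{total} = {share:.2f}")
```

The relative frequencies always sum to 1, which is a quick check on the arithmetic.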
[Chart: relative frequency of each number nonconforming (0 to 5); vertical axis: relative frequency, 0.00 to 0.40]
No. of Nonconforming   Frequency   Cumulative Frequency   Relative Cumulative Frequency
0                      15          15                     15/52 = 0.29
1                      20          15 + 20 = 35           35/52 = 0.67
2                      8           35 + 8 = 43            43/52 = 0.83
3                      5           43 + 5 = 48            48/52 = 0.92
4                      3           48 + 3 = 51            51/52 = 0.98
5                      1           51 + 1 = 52            52/52 = 1.00
Total                  52
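The cumulative columns are running totals of the frequency column. A minimal sketch of that calculation, using the same frequencies; `cumulative_table` is a hypothetical helper name:

```python
from itertools import accumulate

# Frequencies of "number nonconforming", taken from the table above.
freq = {0: 15, 1: 20, 2: 8, 3: 5, 4: 3, 5: 1}


def cumulative_table(freq):
    """Return rows of (value, frequency, cumulative, relative cumulative)."""
    values = sorted(freq)
    running = list(accumulate(freq[v] for v in values))  # running totals
    total = running[-1]
    return [(v, freq[v], c, c / total) for v, c in zip(values, running)]


rows = cumulative_table(freq)
for value, f, cum, rel_cum in rows:
    print(f"{value}: f={f}, cumulative={cum}, relative cumulative={rel_cum:.2f}")
```

The last cumulative frequency equals the total (52) and the last relative cumulative frequency equals 1.00.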
[Chart: cumulative frequency of each number nonconforming (0 to 5); vertical axis: cumulative frequency, 0 to 60]
[Chart: relative cumulative frequency of each number nonconforming (0 to 5); vertical axis: 0.00 to 1.00]
Statistics: the science that deals with the collection, tabulation, analysis, interpretation, and presentation of quantitative data.
Descriptive or deductive statistics, which endeavors to describe and analyze a subject or group
Inductive statistics, which endeavors to determine from a limited amount of data (sample) an important conclusion about a much larger amount of data (population)
Who is a better batsman?
Batter        Scores in 5 consecutive innings
Sachin T      0    25   50   75   100
Rahul D       40   45   50   55   60
Virendra S    0    20   30   30   120
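One way to approach the question is to summarise each batter's scores numerically. The sketch below is illustrative only: the choice of summary measures and of the sample standard deviation (dividing by n - 1) are assumptions, and `summarize` is a hypothetical helper.

```python
import statistics

# Scores in 5 consecutive innings, from the table above.
scores = {
    "Sachin T": [0, 25, 50, 75, 100],
    "Rahul D": [40, 45, 50, 55, 60],
    "Virendra S": [0, 20, 30, 30, 120],
}


def summarize(values):
    """Location and spread of one batter's scores."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (assumption)
    }


summary = {name: summarize(values) for name, values in scores.items()}
for name, stats in summary.items():
    print(name, stats)
```

Sachin T and Rahul D have the same mean, yet their spreads differ widely, so a single average cannot settle who is the better batsman.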
Position / location
Variation
Symmetry
Peakedness
Measure of central tendency
Measure of dispersion
Skewness
Kurtosis
Descriptive Statistics
  Measure of location
    Mean
    Median
    Mode
  Measure of dispersion
    Standard deviation
    Variance
    Range
    Mean deviation
  Graphic displays
    Frequency distribution table
    Frequency distribution polygon
    Histogram
    Bar diagram
Mean
  Arithmetic mean
  Geometric mean
  Harmonic mean
Median
Mode
Trimmed mean
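The measures listed above can be compared on the batting scores from the earlier slide. This is a minimal sketch: the data choice and the `trimmed_mean` helper (which drops a fixed number of observations from each end) are assumptions for illustration, and other trimming conventions exist.

```python
import statistics


def trimmed_mean(values, cut):
    """Arithmetic mean after dropping `cut` observations from each end."""
    if 2 * cut >= len(values):
        raise ValueError("cut removes every observation")
    ordered = sorted(values)
    return statistics.mean(ordered[cut:len(ordered) - cut])


rahul = [40, 45, 50, 55, 60]      # all positive, so every mean is defined
virendra = [0, 20, 30, 30, 120]   # contains an extreme score

arithmetic = statistics.mean(rahul)
geometric = statistics.geometric_mean(rahul)
harmonic = statistics.harmonic_mean(rahul)
print(arithmetic, geometric, harmonic)

print(statistics.median(virendra), statistics.mode(virendra))
trimmed = trimmed_mean(virendra, cut=1)  # drops the 0 and the 120
print(trimmed)
```

For positive data the harmonic mean never exceeds the geometric mean, which never exceeds the arithmetic mean. The trimmed mean shows how much a single extreme score pulls the ordinary mean.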
Range
Inter-quartile range
Mean deviation
Standard deviation
Variance
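Each of these measures of dispersion can be computed in a few lines. The sketch below uses Rahul D's scores from the batting table as sample input. It assumes population formulas (dividing by n) for the variance and standard deviation and Python's default quartile method; textbooks differ on both conventions, and `dispersion` is a hypothetical helper.

```python
import statistics


def dispersion(values):
    """Common measures of spread for a list of numbers."""
    mean = statistics.mean(values)
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles, default method
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "mean_deviation": sum(abs(x - mean) for x in values) / len(values),
        "variance": statistics.pvariance(values),  # population variance
        "std_dev": statistics.pstdev(values),      # population standard deviation
    }


spread = dispersion([40, 45, 50, 55, 60])
for name, value in spread.items():
    print(f"{name}: {value}")
```

The standard deviation is the square root of the variance, so it is expressed in the same units as the scores.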
Following are the scores of 110 students in Statistics:
59 56 66 46 61 44 50 52 36 70
70 46 65 43 38 64 53 58 38 64
60 60 45 51 68 52 43 62 71 53
46 55 51 54 74 39 69 52 36 37
68 72 50 56 51 32 52 75 45 51
61 60 64 67 60 47 37 47 33 38
51 62 42 49 61 71 45 45 56 43
56 50 61 58 56 51 69 59 34 61
59 57 32 75 51 67 72 58 42 74
50 59 65 52 60 70 42 52 51 53
34 47 69 59 49 46 31 63 54 44
Score Tabulation   Score Tabulation   Score Tabulation   Score Tabulation   Score Tabulation
31    I            41                 51    IIII/ III    61    IIII/        71    II
32    II           42    III          52    IIII/ I      62    II           72    II
33    I            43    III          53    III          63    I            73
34    II           44    II           54    II           64    III          74    II
35                 45    IIII         55    I            65    II           75    II
36    II           46    IIII         56    IIII/        66    I
37    II           47    III          57    I            67    II
38    III          48                 58    III          68    II
39    I            49    II           59    IIII/        69    III
40                 50    IIII         60    IIII/        70    III
(IIII/ denotes a bundle of five tally marks; IIII is four.)
Does this make any sense?
Class         Criterion       Frequency
Fail          Less than 40    14
Pass          40-49           21
II Division   50-59           38
I Division    60-69           26
Honours       70 plus         11
Total                         110
Determine the range
Determine the cell interval
Determine the number of classes
Determine the cell midpoints
Determine the cell boundaries
Post the cell frequencies
Range (R)
$R = X_h - X_l$
where $R$ = range, $X_h$ = largest number, $X_l$ = smallest number.
In our case: $R = X_h - X_l = 75 - 31 = 44$
Interval (i)
One way is to use Sturges' rule:
$i = \dfrac{R}{1 + 3.322 \log_{10} n}$
In our case n = 110 and R = 44:
$i = \dfrac{44}{1 + 3.322\,(2.041)} = 5.7$
$i \approx 6$
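Number of classes (k)
$k = \dfrac{R}{i}$
In our case i = 5.7 ≈ 6, so
$k = \dfrac{R}{i} = \dfrac{44}{6} = 7.333$, rounded up to $k = 8$
Remember, $k \propto \dfrac{1}{i}$
Let $R' = k \times i = 8 \times 6 = 48$
Now, $\dfrac{R' - R}{2} = \dfrac{48 - 44}{2} = 2$
Lower limit of first class is: $X_l - 2 = 31 - 2 = 29$
And, upper limit of last class is: $X_h + 2 = 75 + 2 = 77$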
Starting at 29, we create each class by adding 6 (= i) to its lower limit to get its upper limit; the next class begins at the following whole number.
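The arithmetic of the preceding steps can be collected in one place. This is a sketch under stated assumptions: the interval is rounded to the nearest whole number and the number of classes is rounded up, which matches the slide's figures, and `class_plan` is a hypothetical helper name.

```python
import math


def class_plan(n, smallest, largest):
    """Range, interval, class count and outer limits for a grouped table."""
    data_range = largest - smallest                           # R = X_h - X_l
    raw_interval = data_range / (1 + 3.322 * math.log10(n))   # Sturges' rule
    interval = round(raw_interval)                            # nearest whole number (assumption)
    classes = math.ceil(data_range / interval)                # k = R / i, rounded up
    widened = classes * interval                              # R' = k * i
    pad = (widened - data_range) / 2                          # spread the excess over both ends
    return {
        "range": data_range,
        "raw_interval": raw_interval,
        "interval": interval,
        "classes": classes,
        "lower_limit": smallest - pad,
        "upper_limit": largest + pad,
    }


plan = class_plan(n=110, smallest=31, largest=75)
for name, value in plan.items():
    print(f"{name}: {value}")
```

Widening the range to R' and splitting the excess equally keeps the data centred within the class limits.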
Cell Boundaries   Cell Midpoint   Frequency
29 - 35           32              6
36 - 42           39              11
43 - 49           46              18
50 - 56           53              29
57 - 63           60              22
64 - 70           67              16
71 - 77           74              8
Total                             110
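The grouped table can be checked against the raw scores. The sketch below assumes, as in the table, that each cell covers seven whole-number scores (lower limit through lower limit + 6) and that the first cell starts at 29; `grouped_frequencies` is a hypothetical helper.

```python
# Scores of the 110 students, as listed on the raw-data slide.
scores = [
    59, 56, 66, 46, 61, 44, 50, 52, 36, 70,
    70, 46, 65, 43, 38, 64, 53, 58, 38, 64,
    60, 60, 45, 51, 68, 52, 43, 62, 71, 53,
    46, 55, 51, 54, 74, 39, 69, 52, 36, 37,
    68, 72, 50, 56, 51, 32, 52, 75, 45, 51,
    61, 60, 64, 67, 60, 47, 37, 47, 33, 38,
    51, 62, 42, 49, 61, 71, 45, 45, 56, 43,
    56, 50, 61, 58, 56, 51, 69, 59, 34, 61,
    59, 57, 32, 75, 51, 67, 72, 58, 42, 74,
    50, 59, 65, 52, 60, 70, 42, 52, 51, 53,
    34, 47, 69, 59, 49, 46, 31, 63, 54, 44,
]


def grouped_frequencies(values, first_lower, width, n_classes):
    """Return rows of (lower, upper, midpoint, frequency) for integer data."""
    table = []
    for c in range(n_classes):
        lower = first_lower + c * width
        upper = lower + width - 1          # inclusive upper boundary
        midpoint = (lower + upper) / 2
        count = sum(lower <= v <= upper for v in values)
        table.append((lower, upper, midpoint, count))
    return table


table = grouped_frequencies(scores, first_lower=29, width=7, n_classes=7)
for lower, upper, midpoint, count in table:
    print(f"{lower} - {upper}  midpoint {midpoint:g}  frequency {count}")
```

Summing the frequency column should return the number of students, which guards against a score falling outside every cell.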
[Chart: cell frequencies (6, 11, 18, 29, 22, 16, 8) against cell midpoints (32, 39, 46, 53, 60, 67, 74); vertical axis: frequency, 0 to 35]
[Chart: frequency against cell midpoints (32, 39, 46, 53, 60, 67, 74); vertical axis: 0 to 35]
[Chart: cumulative frequency against cell midpoints (32, 39, 46, 53, 60, 67, 74); vertical axis: 0 to 120]
Making sense of a phenomenon: population or sample
Four dimensions of a phenomenon:
  Central tendency
  Dispersion
  Symmetry
  Peakedness
Grouped vs. ungrouped data