1. Presenting and Summarizing Data
-
7/29/2019 1. Presenting and Summarizing Data
Presenting and Summarizing Data
Spori Goran, PhD.
http://kif.hr/predmet/mki
http://www.science4performance.com/
-
Statistics?
The science of collecting, organizing, analyzing, interpreting and presenting data
-
Topics we will review
Descriptive Statistics
Frequency Distributions and Histograms
Relative / Cumulative Frequency
Measures of Central Tendency
Mean, Median, Mode, Midrange
-
Topics (continued)
Measures of Dispersion (Variation)
Range, Standard Deviation, Variance and Coefficient of Variation
Shape
Symmetric, Skewed, using Box-and-Whisker Plots
Quartile
Statistical Relationships
Correlation, Covariance
-
Descriptive Statistics
A collection of quantitative measures and ways of describing data. This includes:
Frequency distributions & histograms,
measures of central tendency
and
measures of dispersion
-
Descriptive Statistics
Collect Data e.g. Survey
Present Data e.g. Tables and Graphs
Characterize Data e.g. Mean: $\bar{x} = \frac{\sum x_i}{n}$
A Characteristic of a:
Population is a Parameter
Sample is a Statistic.
-
Collection of Data
Survey/questionnaires/interviews
Direct observation
Secondary data source (e.g., Medical charts)
-
Presenting Data
Graphics
The visual representation of data may be used not only to present results/findings in the data, but may also be used to learn about the data.
-
Summary Measures in Descriptive Statistics
Summary Measures
Central Tendency: Mean, Median, Mode, Midrange
Quartile
Variation: Range, Variance, Standard Deviation, Coefficient of Variation
-
Measures of Central Tendency
Central Tendency: Mean, Median, Mode, Midrange
-
The Mean (Arithmetic Average)
It is the Arithmetic Average of data values:
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
where $\bar{x}$ is the Sample Mean
The Most Common Measure of Central Tendency
Affected by Extreme Values (Outliers)
[Figure: two number-line plots; Mean = 5 on an axis from 0 to 10, and Mean = 6 on an axis extending to 14]
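As a minimal sketch of the formula above, the snippet below computes the sample mean directly and shows how a single extreme value shifts it. The two data sets are hypothetical, chosen only so that the means come out to 5 and 6; the points plotted on the slide are not recoverable.

```python
from statistics import mean

# Hypothetical data sets (assumed for illustration, not taken from the slide).
base = [1, 3, 5, 7, 9]
with_outlier = [1, 3, 5, 7, 14]  # largest value replaced by an extreme one


def sample_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)


print(sample_mean(base))          # 5.0
print(sample_mean(with_outlier))  # 6.0

# The standard library gives the same result.
print(mean(base) == sample_mean(base))  # True
```

Only one observation changed, yet the mean moved by a full unit, which is what "affected by extreme values" means in practice.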
-
The Median
Important Measure of Central Tendency
In an ordered array, the median is the middle number.
If n is odd, the median is the middle number.
If n is even, the median is the average of the 2 middle numbers.
Not Affected by Extreme Values
[Figure: two number-line plots; Median = 5 on an axis from 0 to 10, and Median = 5 on an axis extending to 14]
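The odd/even rule can be sketched in a few lines. The function name `median_of` and the sample data are assumptions made for illustration; the data are hypothetical and only chosen so that the median stays at 5 when an extreme value is introduced.

```python
from statistics import median


def median_of(values):
    """Middle value of the ordered data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([1, 3, 5, 7, 9]))   # 5
print(median_of([1, 3, 5, 7, 14]))  # 5   (unchanged by the extreme value)
print(median_of([1, 3, 5, 7]))      # 4.0 (average of 3 and 5)

# Cross-check against the standard library.
print(median([1, 3, 5, 7]) == median_of([1, 3, 5, 7]))  # True
```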
-
The Mode
A Measure of Central Tendency
Value that Occurs Most Often
Not Affected by Extreme Values
There May Not be a Mode
There May be Several Modes
Used for Either Numerical or Categorical Data
[Figure: number line from 0 to 14 with Mode = 9; number line from 0 to 6 with No Mode]
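The three cases (one mode, several modes, no mode) can be sketched with a frequency count. The helper `modes` is a hypothetical name, the data are made up, and the convention that "no mode" means every value occurs exactly once is an assumption of this sketch.

```python
from collections import Counter


def modes(values):
    """Return all most-frequent values, or [] when every value occurs only once."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []  # assumed convention: no value repeats, so there is no mode
    return sorted(v for v, c in counts.items() if c == top)


print(modes([1, 3, 5, 7, 9, 9, 14]))      # [9]
print(modes([0, 1, 2, 3, 4, 5, 6]))       # []
print(modes(["A", "B", "B", "O", "O"]))   # ['B', 'O']
```

The last call uses categorical labels, which works because the mode only needs counts, not arithmetic.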
-
Midrange
A Measure of Central Tendency
Average of Smallest and Largest Observation:
$\text{Midrange} = \frac{x_{\text{largest}} + x_{\text{smallest}}}{2}$
Affected by Extreme Values
[Figure: two number-line plots on axes from 0 to 10; Midrange = 5 in both]
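A short sketch of the midrange, using hypothetical data to show its sensitivity to an extreme value. Because only the smallest and largest observations enter the calculation, changing the maximum moves the result directly.

```python
def midrange(values):
    """Average of the smallest and largest observation."""
    return (min(values) + max(values)) / 2


# Hypothetical data, assumed for illustration.
print(midrange([1, 3, 5, 7, 9]))   # 5.0
print(midrange([1, 3, 5, 7, 14]))  # 7.5
```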
-
Quartiles
Not a Measure of Central Tendency
Split Ordered Data into 4 Quarters (25% each), divided by Q1, Q2 and Q3
Position of i-th Quartile: $\text{position of } Q_i = \frac{i(n+1)}{4}$
Data in Ordered Array: 11 12 13 16 16 17 18 21 22
Position of Q1 = 1(9 + 1)/4 = 2.50, so Q1 = (12 + 13)/2 = 12.5
-
Quartiles (continued)
Data in Ordered Array: 11 12 13 16 16 17 18 21 22
Position of Q3 = 3(9 + 1)/4 = 7.50, so Q3 = (18 + 21)/2 = 19.5
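The position rule i(n + 1)/4 can be sketched as follows. The function name `quartile` is hypothetical, and linear interpolation between neighbouring values is an assumption of this sketch; for a position ending in .5 it reduces to averaging the two adjacent observations, as in the worked example.

```python
from statistics import quantiles


def quartile(values, i):
    """i-th quartile (i = 1, 2, 3) using the 1-based position i * (n + 1) / 4."""
    ordered = sorted(values)
    n = len(ordered)
    position = i * (n + 1) / 4
    lower = int(position)          # whole part of the position
    fraction = position - lower    # how far to move toward the next value
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print(quartile(data, 1))  # 12.5
print(quartile(data, 3))  # 19.5

# statistics.quantiles with method="exclusive" uses the same (n + 1) positions.
print(quantiles(data, n=4, method="exclusive"))  # [12.5, 16.0, 19.5]
```

Other software may use different quartile conventions, so results can differ slightly from this rule on small samples.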
-
Measures of Dispersion (Variation)
Variation
Range
Variance: Population Variance, Sample Variance
Standard Deviation: Population Standard Deviation, Sample Standard Deviation
Coefficient of Variation
-
Understanding Variation
The more spread out or dispersed the data, the larger the measures of variation
The more concentrated or homogeneous the data, the smaller the measures of variation
If all observations are equal, measures of variation = Zero
All measures of variation are Nonnegative
-
The Range
Measure of Variation
Difference Between Largest & Smallest Observations:
$\text{Range} = x_{\text{Largest}} - x_{\text{Smallest}}$
Ignores How Data Are Distributed:
[Figure: two plots on axes from 7 to 12; Range = 12 - 7 = 5 in both]
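To illustrate why the range ignores how data are distributed, the sketch below compares two hypothetical data sets that share the same smallest and largest values. The data are assumptions made for illustration; the population standard deviation is used only as a second measure that does react to the difference.

```python
from statistics import pstdev


def value_range(values):
    """Largest observation minus smallest observation."""
    return max(values) - min(values)


# Hypothetical data sets with the same extremes but different spread in between.
evenly_spread = [7, 8, 9, 10, 11, 12]
clustered = [7, 10, 10, 10, 10, 12]

print(value_range(evenly_spread), value_range(clustered))  # 5 5
print(pstdev(evenly_spread) > pstdev(clustered))           # True
```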
-
Variance
Important Measure of Variation
Shows Variation About the Mean:
For the Population: $\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$
For the Sample: $s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$
For the Population: use N in the denominator.
For the Sample: use n - 1 in the denominator.
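The only difference between the two variance formulas is the denominator, which the following sketch makes explicit. The function names and the data set are hypothetical, chosen so that the sum of squared deviations is a round number (32).

```python
from math import isclose
from statistics import pvariance, variance


def population_variance(values):
    """Mean squared deviation from the mean: divide by N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Squared deviations from the sample mean: divide by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical; mean = 5, squared deviations sum to 32

print(population_variance(data))  # 4.0  (32 / 8)
print(sample_variance(data))      # 32 / 7, about 4.571

# Both agree with the standard library.
print(isclose(population_variance(data), pvariance(data)))  # True
print(isclose(sample_variance(data), variance(data)))       # True
```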
-
Standard Deviation
Most Important Measure of Variation
Shows Variation About the Mean:
For the Population: $\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}$
For the Sample: $s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$
For the Population: use N in the denominator.
For the Sample: use n - 1 in the denominator.
-
Sample Standard Deviation
$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$
For the Sample: use n - 1 in the denominator.
Data $X_i$: 10 12 14 15 17 18 18 24
n = 8, Mean = 16
$s = \sqrt{\frac{(10-16)^2 + (12-16)^2 + (14-16)^2 + (15-16)^2 + (17-16)^2 + (18-16)^2 + (18-16)^2 + (24-16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095$
-
Comparing Standard Deviations
Data $X_i$: 10 12 14 15 17 18 18 24
N = 8, Mean = 16
Sample: $s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} = 4.3095$
Population: $\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 4.0311$
Value for the Standard Deviation is larger for data considered as a Sample.
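The worked example can be reproduced with Python's standard library. This sketch assumes the eight listed observations are the complete data set; the squared deviations from the mean of 16 then sum to 130.

```python
from statistics import mean, pstdev, stdev

data = [10, 12, 14, 15, 17, 18, 18, 24]

x_bar = mean(data)
sum_sq = sum((x - x_bar) ** 2 for x in data)

s = stdev(data)       # sample standard deviation: divides by n - 1
sigma = pstdev(data)  # population standard deviation: divides by N

print(x_bar, sum_sq)    # 16 130
print(round(s, 4))      # 4.3095
print(round(sigma, 4))  # 4.0311
print(s > sigma)        # True
```

Dividing by n - 1 rather than N always gives the larger value for the same data, and the gap shrinks as n grows.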
-
Comparing Standard Deviations
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = 0.9258
Data C: Mean = 15.5, s = 4.57
[Figure: three dot plots on a common axis from 11 to 21]
Coefficient of Variation
Measure of Relative Variation
Always a %
Shows Variation Relative to Mean
Used to Compare 2 or More Groups
Formula (for Sample): $CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$
-
Comparing Coefficient of Variation
Group A: Average Health Measure = 50, Standard Deviation = 5
Group B: Average Health Measure = 100, Standard Deviation = 5
$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$
Coefficient of Variation:
Group A: CV = (5/50) × 100% = 10%
Group B: CV = (5/100) × 100% = 5%
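The comparison of the two groups can be sketched from their summary figures alone. The function name `coefficient_of_variation` is a hypothetical helper, not something defined in the slides.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * std_dev / mean


print(coefficient_of_variation(5, 50))   # 10.0  (Group A)
print(coefficient_of_variation(5, 100))  # 5.0   (Group B)
```

Both groups have the same standard deviation, but relative to its mean Group A varies twice as much as Group B.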
-
Shape
Describes How Data Are Distributed
Measures of Shape: Symmetric or Skewed
Left-Skewed: Mean < Median < Mode
Symmetric: Mean = Median = Mode
Right-Skewed: Mode < Median < Mean
[Figure: three distribution curves; surviving labels are "< -1" (left-skewed panel), "-0.5" (symmetric panel) and "> 1" (right-skewed panel)]
-
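Box-and-Whisker Plot
Graphical Display of Data Using 5-Number Summary:
$X_{\text{smallest}}$, Q1, Median, Q3, $X_{\text{largest}}$
[Figure: box-and-whisker plot on an axis from 4 to 12, marking $X_{\text{smallest}}$, Q1, Median, Q3 and $X_{\text{largest}}$]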
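The five numbers behind a box-and-whisker plot can be computed as in the sketch below. It reuses the ordered array from the quartile slides; the function name `five_number_summary` is hypothetical, and `method="exclusive"` is chosen on the assumption that the quartiles should follow the i(n + 1)/4 position rule.

```python
from statistics import median, quantiles


def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest)."""
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    return (min(values), q1, median(values), q3, max(values))


data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print(five_number_summary(data))  # (11, 12.5, 16, 19.5, 22)
```

The box spans Q1 to Q3 with a line at the median, and the whiskers extend to the smallest and largest observations.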
-
Distribution Shape & Box-and-Whisker Plots
[Figure: three box-and-whisker plots labeled Left-Skewed, Symmetric and Right-Skewed, each marking Q1, Median and Q3]
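In a box-and-whisker plot, skew typically shows up as the median sitting off-center inside the box. One way to put a number on that is the quartile (Bowley) skewness coefficient, (Q3 + Q1 - 2·Median) / (Q3 - Q1). That coefficient is not named in the slides and is used here as an assumption, and the three small data sets are hypothetical.

```python
from statistics import quantiles


def quartile_skewness(values):
    """Bowley coefficient: positive if the median is nearer Q1, negative if nearer Q3."""
    q1, q2, q3 = quantiles(values, n=4, method="exclusive")
    if q3 == q1:
        return 0.0  # no spread inside the box, so no direction of skew
    return (q3 + q1 - 2 * q2) / (q3 - q1)


def shape(values):
    """Classify the shape from the sign of the quartile skewness."""
    k = quartile_skewness(values)
    if k > 0:
        return "right-skewed"
    if k < 0:
        return "left-skewed"
    return "symmetric"


# Hypothetical data sets, assumed for illustration.
print(shape([1, 6, 9, 10, 11, 11, 12]))  # left-skewed
print(shape([1, 2, 3, 4, 5, 6, 7]))      # symmetric
print(shape([1, 2, 2, 3, 4, 7, 12]))     # right-skewed
```

This looks only at the middle half of the data, so it can report "symmetric" for a sample whose skew comes entirely from its tails; it is a rough check, not a replacement for looking at the plot.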
-
Summary
Discussed Measures of Central Tendency
Mean, Median, Mode, Midrange
Quartiles
Addressed Measures of Variation
The Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
Determined Shape of Distributions
Symmetric, Skewed, Box-and-Whisker Plot
Left-Skewed: Mean < Median < Mode; Symmetric: Mean = Median = Mode; Right-Skewed: Mode < Median < Mean