Brought to you by Tutorial Support Services
The Math Center
Statistics is the study of how to collect, organize, analyze, and interpret numerical information.
Descriptive statistics generally characterizes or describes a set of data elements by graphically displaying the information or describing its central tendencies and how it is distributed.
Inferential statistics tries to infer information about a population by using information gathered by sampling.
Population: The complete set of data elements where N refers to the Population Size.
Sample: A portion of a population selected for further analysis.
Midrange: The arithmetic mean of the highest and lowest data elements.
Parameter: A characteristic of the whole population.
Statistic: A characteristic of a sample, presumably measurable.
The Arithmetic Mean is obtained by summing all elements of the data set and dividing by the number of elements:
$$\text{Mean} = \bar{x} = \frac{\sum x}{n}$$
The Sample Size is the number of elements in a sample. It is referred to by the symbol n, whereas x refers to each element in the data set.
The Mode is the data element which occurs most frequently.
The Median is the middle element when the data set is arranged in order of magnitude.
1. When n is odd, simply take the middle value of the data set.
2. When n is even, take the two middle values (the pair with the same number of elements before them as after them), add them, and divide by 2.
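The odd/even rule above can be sketched in a few lines of Python. This is only an illustration: the function name `median_of` and the sample lists are assumptions made for the example, not part of the handout.

```python
def median_of(values):
    """Return the median using the odd/even rule described above."""
    ordered = sorted(values)  # arrange the data in order of magnitude
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # n is odd: take the single middle value
        return ordered[middle]
    # n is even: add the two middle values and divide by 2
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median_of([3, 1, 2]))     # odd n (illustrative data)
print(median_of([4, 1, 3, 2]))  # even n (illustrative data)
```

Sorting first matters: the rule only works once the elements are in order of magnitude.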
The Midrange is the arithmetic mean of the highest and lowest data elements:
$$\text{Midrange} = \frac{x_{\max} + x_{\min}}{2}$$
Example: A sample of size 9 (n=9) is taken of student quiz scores with the following results: 5, 6, 7, 7, 8, 8, 8, 9.5, 10
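Answer:
The mean is:
$$\bar{x} = \frac{5 + 6 + 7 + 7 + 8 + 8 + 8 + 9.5 + 10}{9} = \frac{68.5}{9} \approx 7.61$$
The median is 8, since this is the middle (fifth) element.
The mode is 8, since it is the data value that appears most frequently in the distribution.
The midrange is:
$$\text{Midrange} = \frac{10 + 5}{2} = \frac{15}{2} = 7.5$$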
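The quiz-score example can be checked with Python's standard `statistics` module. This is a sketch, not part of the handout: the variable names are chosen for illustration, and the midrange is computed by hand because the module has no helper for it.

```python
import statistics

# Quiz scores from the example above (n = 9)
scores = [5, 6, 7, 7, 8, 8, 8, 9.5, 10]

mean = statistics.mean(scores)      # sum of the elements divided by n
median = statistics.median(scores)  # middle element of the sorted data
mode = statistics.mode(scores)      # most frequent element
midrange = (max(scores) + min(scores)) / 2  # mean of highest and lowest

print(round(mean, 2), median, mode, midrange)
```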
Range is the difference between the highest and lowest data element.
The Standard Deviation is another way to measure dispersion. It is the most common and useful measure because it describes roughly the typical distance of each score from the mean. The formula for the sample standard deviation is as follows:
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
The Population Standard Deviation is as follows:
$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{N}}$$
Notice the difference between the sample and population standard deviations. The sample standard deviation uses (n-1) in the denominator and is therefore slightly larger than the population standard deviation, which uses N (often written as n).
Variance is the third method of measuring dispersion. It is the square of the standard deviation:
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}; \qquad \sigma^2 = \frac{\sum (x - \bar{x})^2}{N}$$
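To make the difference between the two denominators concrete, here is a minimal Python sketch of the range and of the sample and population standard deviations, written directly from the definitions above. The function names and the data set `values` are hypothetical, chosen only for illustration.

```python
import math


def sum_of_squares(data):
    """Sum of squared deviations of each element from the mean."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data)


def sample_std(data):
    """Sample standard deviation: divides by n - 1."""
    if len(data) < 2:
        raise ValueError("need at least two data elements")
    return math.sqrt(sum_of_squares(data) / (len(data) - 1))


def population_std(data):
    """Population standard deviation: divides by N."""
    if not data:
        raise ValueError("need at least one data element")
    return math.sqrt(sum_of_squares(data) / len(data))


def data_range(data):
    """Range: difference between the highest and lowest element."""
    return max(data) - min(data)


values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, not from the handout
print(data_range(values), population_std(values), sample_std(values))
```

Squaring either result gives the corresponding variance. Because the sample version divides by the smaller number n - 1, it comes out larger than the population version for the same data.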
First, we want to calculate the mean and sample standard deviation of the following distribution: 1, 2, 3, 4, 5. We calculate our mean, and it is:
$$\text{Mean} = \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$$
Now we construct a table in which to keep track of our data:
$x$    $x - \bar{x}$    $(x - \bar{x})^2$
1      -2               4
2      -1               1
3       0               0
4       1               1
5       2               4
We now want to find the sum of $(x - \bar{x})^2$: 4+1+0+1+4=10.
The total number of values is n=5. To find n-1, subtract 1 from 5 to get 4.
Now we find the sample standard deviation:
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{10}{4}} \approx 1.5811$$
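The table-based calculation above can be reproduced step by step in Python. This is an illustrative sketch; the variable names (`rows`, `sum_sq`, and so on) are assumptions made for the example.

```python
import math

data = [1, 2, 3, 4, 5]
n = len(data)
mean = sum(data) / n  # 15 / 5

# One row per data element: x, its deviation from the mean, and the square
rows = [(x, x - mean, (x - mean) ** 2) for x in data]

print(f"{'x':>3} {'x - mean':>9} {'(x - mean)^2':>13}")
for x, deviation, squared in rows:
    print(f"{x:>3} {deviation:>9.0f} {squared:>13.0f}")

sum_sq = sum(squared for _, _, squared in rows)  # sum of the last column
s = math.sqrt(sum_sq / (n - 1))                  # sample standard deviation
print(sum_sq, s)
```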
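Using the formula for the population standard deviation gives us the following:
$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{N}} = \sqrt{\frac{10}{5}} \approx 1.414$$
The variance of our distribution 1, 2, 3, 4, 5 is:
$$\text{Variance} = s^2 = (1.5811)^2 \approx 2.5$$
Squaring σ gives us:
$$\sigma^2 = (1.414)^2 \approx 2$$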
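As a cross-check, the same four quantities can be obtained from Python's standard `statistics` module, which provides separate functions for the sample and population versions. The variable names are illustrative only.

```python
import math
import statistics

data = [1, 2, 3, 4, 5]

s = statistics.stdev(data)           # sample standard deviation (n - 1)
sigma = statistics.pstdev(data)      # population standard deviation (N)
s2 = statistics.variance(data)       # sample variance
sigma2 = statistics.pvariance(data)  # population variance

print(s, sigma, s2, sigma2)

# Each variance is the square of the matching standard deviation
print(math.isclose(s ** 2, s2), math.isclose(sigma ** 2, sigma2))
```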
Descriptive Statistics Handout