chi square goodness of fit
What is a Chi-Square Test of Goodness of Fit?
Questions of goodness of fit have become increasingly important in modern statistics.
Questions of goodness of fit juxtapose complex observed patterns against hypothesized or previously observed patterns to test overall and specific differences among them.
If the difference between observed and hypothesized is small, then the FIT IS GOOD.
For example:

Observed       Hypothesized     Difference
51% Females    50% Females      1%
If the difference between observed and hypothesized is BIG, then the FIT IS NOT GOOD.
For example:

Observed       Hypothesized     Difference
50% Females    22% Females      28%
Here is an example: We want to know whether a sample we have selected matches the national percentage of a certain ethnic group.

Observed: 2% of the sample is made up of members of this ethnic group.
Hypothesized: 10% of the population is made up of this ethnic group.
Difference: 8%
You will use certain statistical methods to determine whether the difference between the observed and hypothesized values is statistically significant.
Here is an example:
Problem – The chair of a statistics department suspects that some of her faculty are more popular with students than others.
There are three sections of introductory stats that are taught at the same time in the morning by Professors Cauforek, Kerr, and Rector. 66 students are planning on enrolling in one of the three classes.
What would you expect the number of enrollees to be in each class if popularity were not an issue? If the 66 students spread evenly across the three classes, each class would get 66 / 3 = 22.

Professor Cauforek   Professor Kerr   Professor Rector
22                   22               22

These are our expected values.
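The expected counts can be sketched in a few lines of Python. The function name `expected_counts` is made up for this illustration, and the sketch assumes every class is equally likely, which is what "popularity is not an issue" means here.

```python
def expected_counts(total, n_groups):
    """Expected count per group when every group is equally likely."""
    if n_groups <= 0:
        raise ValueError("n_groups must be positive")
    return [total / n_groups] * n_groups


# 66 students spread evenly over 3 classes
print(expected_counts(66, 3))  # [22.0, 22.0, 22.0]
```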
Now let’s see what was observed. The number who enrolled in each class was:

Professor Cauforek   Professor Kerr   Professor Rector
31                   25               10
We will test the degree to which the observed data fit the expected enrollments.

             Professor Cauforek   Professor Kerr   Professor Rector
Observed     31                   25               10
Expected     22                   22               22
Here is the formula:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Where:
χ² = Chi-Square
Σ = sum of
O = observed score (31, 25, and 10 for Professors Cauforek, Kerr, and Rector)
E = expected score (22 for each professor)
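As a minimal sketch, the formula translates almost directly into Python. The function name `chi_square_statistic` and the input checks are assumptions added for this illustration, not part of the original slides.

```python
def chi_square_statistic(observed, expected):
    """Return the sum over all categories of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Sanity check: when observed equals expected, the statistic is zero.
print(chi_square_statistic([22, 22, 22], [22, 22, 22]))  # 0.0
```

Each category contributes its own squared difference scaled by its expected count, so a large gap in a single category is enough to push the total up.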
Here is the null hypothesis:
There is no significant difference between the expected and the observed number of students enrolled in the three stats professors’ classes.
Now we will compute the χ² value and compare it with the critical value.
• If the χ² value exceeds the critical value, then we will reject the null hypothesis.
• If the χ² value DOES NOT exceed the critical value, then we will fail to reject the null hypothesis.
Let’s compute the χ² value.

             Professor Cauforek   Professor Kerr   Professor Rector
Expected     22                   22               22
Observed     31                   25               10

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

OR, written out with one term per professor:

$$\chi^2 = \frac{(O - E)^2}{E} + \frac{(O - E)^2}{E} + \frac{(O - E)^2}{E}$$
Let’s input each professor’s data into the equation, one observed and one expected value at a time:

$$\chi^2 = \frac{(31 - 22)^2}{22} + \frac{(25 - 22)^2}{22} + \frac{(10 - 22)^2}{22}$$
Now for the calculation:

$$\chi^2 = \frac{(31 - 22)^2}{22} + \frac{(25 - 22)^2}{22} + \frac{(10 - 22)^2}{22}$$

$$\chi^2 = \frac{(9)^2}{22} + \frac{(3)^2}{22} + \frac{(-12)^2}{22}$$

$$\chi^2 = \frac{81}{22} + \frac{9}{22} + \frac{144}{22}$$
Convert the fractions into decimals (rounded to one decimal place):

$$\chi^2 = \frac{81}{22} + \frac{9}{22} + \frac{144}{22} \approx 3.7 + 0.4 + 6.5$$

Sum the terms:

$$\chi^2 = 10.6$$
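The hand calculation can be double-checked term by term. This is only a sketch: the dictionary layout and variable names are choices made for the illustration. Keeping three decimals shows where the rounded 3.7, 0.4, and 6.5 come from.

```python
observed = {"Cauforek": 31, "Kerr": 25, "Rector": 10}
expected = 22  # same expected enrollment for every professor

# One (O - E)^2 / E term per professor
terms = {name: (o - expected) ** 2 / expected for name, o in observed.items()}

for name, term in terms.items():
    print(f"{name}: {term:.3f}")
# Cauforek: 3.682
# Kerr: 0.409
# Rector: 6.545

total = sum(terms.values())
print(f"chi-square: {total:.3f}")  # chi-square: 10.636
```

Without the intermediate rounding, the sum is 234/22, or about 10.64, which agrees with the 10.6 on the slide to one decimal place.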
As a contrasting example, note what the χ² value would be if the observed and expected values were more similar:

             Professor Cauforek   Professor Kerr   Professor Rector
Expected     22                   22               22
Observed     24                   22               20

$$\chi^2 = \frac{(O - E)^2}{E} + \frac{(O - E)^2}{E} + \frac{(O - E)^2}{E}$$

$$\chi^2 = \frac{(24 - 22)^2}{22} + \frac{(22 - 22)^2}{22} + \frac{(20 - 22)^2}{22}$$

$$\chi^2 = \frac{(2)^2}{22} + \frac{(0)^2}{22} + \frac{(-2)^2}{22}$$

$$\chi^2 = \frac{4}{22} + \frac{0}{22} + \frac{4}{22}$$

$$\chi^2 \approx 0.2 + 0.0 + 0.2$$

$$\chi^2 = 0.4$$
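As a cross-check on the rounding, both sums can be kept as exact fractions until the last step. The subscripts "similar" and "original" are labels added here only to tell the two data sets apart.

$$\chi^2_{\text{similar}} = \frac{4 + 0 + 4}{22} = \frac{8}{22} \approx 0.36, \qquad \chi^2_{\text{original}} = \frac{81 + 9 + 144}{22} = \frac{234}{22} \approx 10.64$$

To one decimal place these are the 0.4 and 10.6 obtained by rounding each term first.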
So the moral of the story is that the closer the expected and observed values are to one another, the smaller the Chi-square value and the greater the goodness of fit (as seen below).

             Professor Cauforek   Professor Kerr   Professor Rector
Expected     22                   22               22
Observed     24                   22               20

χ² = 0.4
On the other hand, the farther the expected and observed values are from one another, the larger the Chi-square value and the poorer the goodness of fit (as seen below).

             Professor Cauforek   Professor Kerr   Professor Rector
Expected     22                   22               22
Observed     31                   25               10

χ² = 10.6
Now we determine whether a χ² of 10.6 exceeds the critical χ² for our three terms.
To find the critical χ² we first must determine the degrees of freedom as well as set the probability level.
The probability or alpha level is the probability of a Type I error we are willing to live with (i.e., the probability of rejecting the null hypothesis when it is actually true). Generally this value is 0.05, which is like saying we are willing to wrongly reject a true null hypothesis 5 out of 100 times.
Degrees of freedom are calculated by subtracting 1 from the number of groups. (Three groups minus 1 = 2.)
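What an alpha of 0.05 means in practice can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the original slides: 66 students each pick one of three classes purely at random (so the null hypothesis is true), and we count how often χ² still exceeds 5.99, the table cutoff for alpha = 0.05 with 2 degrees of freedom. The function names, trial count, and seed are arbitrary choices.

```python
import random


def chi_square_statistic(observed, expected):
    """Sum over categories of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def false_rejection_rate(n_students=66, n_classes=3, critical=5.99,
                         n_trials=10_000, seed=0):
    """Share of simulated enrollments that reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    expected = [n_students / n_classes] * n_classes
    rejections = 0
    for _ in range(n_trials):
        counts = [0] * n_classes
        for _ in range(n_students):
            counts[rng.randrange(n_classes)] += 1  # every class equally likely
        if chi_square_statistic(counts, expected) > critical:
            rejections += 1
    return rejections / n_trials


print(false_rejection_rate())  # should land near 0.05
```

The rate will not be exactly 0.05, because the counts are whole numbers and the chi-square distribution is only an approximation, but it should be close.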
We now have all of the information we need to determine the critical χ².
We go to the Chi-Square Distribution Table and locate the degrees of freedom (2).
And then we locate the probability or alpha level (0.050):

df    0.100   0.050   0.025
1      2.71    3.84    5.02
2      4.61    5.99    7.38
3      6.25    7.82    9.35
4      7.78    9.49   11.14
5      9.24   11.07   12.83
6     10.64   12.59   14.45
7     12.02   14.07   16.01
8     13.36   15.51   17.54
9     14.68   16.92   19.02
…      …       …       …

Where these two values intersect in the table we find the critical χ²: 5.99.
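The table values can also be computed rather than looked up. The sketch below uses only the Python standard library: it evaluates the chi-square cumulative distribution with a series expansion of the regularized incomplete gamma function, then searches for the cutoff by bisection. The function names are made up for this illustration. In practice a statistics library would normally be used instead (for example, SciPy provides `scipy.stats.chi2.ppf`).

```python
import math


def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 0.0
    a, half_x = df / 2, x / 2
    # Series for the regularized lower incomplete gamma function P(a, x/2)
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= half_x / (a + n)
        total += term
        n += 1
    return total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))


def chi2_critical(alpha, df):
    """Value that a chi-square variable exceeds with probability alpha."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < 1 - alpha:  # widen until the cutoff is bracketed
        hi *= 2
    for _ in range(100):                 # bisection
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# The df = 2 row of the table
print([round(chi2_critical(a, 2), 2) for a in (0.100, 0.050, 0.025)])
# [4.61, 5.99, 7.38]
```

For 2 degrees of freedom there is also a shortcut: the cutoff is -2·ln(alpha), which gives 5.99 for alpha = 0.05.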
Since the chi-square goodness of fit value (10.6) exceeds the critical χ² (5.99), we will reject the null hypothesis:
There is no significant difference between the expected and the observed number of students enrolled in the three stats professors’ classes.
There actually is a significant difference.
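The whole test can be sketched end to end as follows. The function name `goodness_of_fit_decision` is hypothetical, and the critical value 5.99 is taken from the table rather than computed. The p-value line relies on a shortcut that holds only for 2 degrees of freedom, where the probability of exceeding a value x is exp(-x/2); it is an addition for illustration and does not appear on the slides.

```python
import math


def goodness_of_fit_decision(observed, expected, critical):
    """Return (chi-square statistic, True if the null hypothesis is rejected)."""
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, statistic > critical


statistic, reject = goodness_of_fit_decision(
    observed=[31, 25, 10],
    expected=[22, 22, 22],
    critical=5.99,  # alpha = 0.05, df = 2, from the table
)

# Shortcut valid ONLY for df = 2: P(chi-square > x) = exp(-x / 2)
p_value = math.exp(-statistic / 2)

print(f"chi-square = {statistic:.2f}")    # chi-square = 10.64
print(f"reject null hypothesis: {reject}")  # reject null hypothesis: True
print(f"p-value = {p_value:.4f}")         # p-value = 0.0049
```

A p-value of about 0.005 is well below 0.05, which matches the conclusion reached by comparing 10.6 with 5.99.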
In summary, questions of goodness of fit juxtapose observed patterns against hypothesized patterns to test overall and specific differences among them.