Excel Project 1
Running Head: EXCEL PROJECT
Excel Project: Public School Spending and its Relationship to SAT Scores
Kelly Falen
EDU 6976 Interpreting and Applying Educational Research II
Seattle Pacific University
Spring 2009
Part 1
Data were collected for all 50 states on the average expenditure per pupil in public schools
and compared to SAT scores for students in each state to determine whether there is a
relationship between spending and student achievement. The variables compared and
analyzed are the current expenditure per pupil, student/teacher ratio, estimated average annual
salary of teachers, percentage of students eligible to take the SAT, and the average math,
verbal and total scores for those students who took the SAT. These data were extracted from the
1997 Digest of Education Statistics and reflect information for the 1994-1995 school year.
Figures 1-3 are histograms, visual representations of data in terms of the frequency of
each range of values, that depict the following averages: expenditure per student (in
thousands of dollars), student/teacher ratio and teacher salary. All three of these histograms show
that the data do not follow a normal distribution, but instead are positively skewed. In Figure 1
the highest frequency of dollars spent per pupil is between five and six thousand. The histogram
in Figure 2 shows the highest frequency of student/teacher ratio is between 16 and 18 students
per teacher. Finally, Figure 3 indicates that the highest frequency of teacher salary is between 30
and 35 thousand dollars.
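The binning behind a histogram such as Figure 1 can be sketched in a few lines of Python. This is only an illustration, not the Excel procedure used to produce the figures: the expenditure values are made-up placeholders, and the function name bin_counts and the bin width of 1 (thousand dollars) are assumptions chosen for the example.

```python
import math
from collections import Counter

def bin_counts(values, width):
    """Count how many values fall in each bin [start, start + width)."""
    counts = Counter(math.floor(v / width) * width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical expenditures per pupil, in thousands of dollars (NOT the study data).
sample = [4.4, 4.9, 5.2, 5.5, 5.8, 5.9, 6.1, 6.4, 7.2, 9.8]
counts = bin_counts(sample, 1)

# The bin with the highest frequency is what the text reports for each histogram.
modal_bin = max(counts, key=counts.get)

for start, n in counts.items():
    print(f"{start}-{start + 1}: {'#' * n}")
```

In this made-up sample a single high value sits far to the right of the modal bin, which is the kind of pattern that produces a positive skew.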
Figure 1: Expenditure Per Pupil
Figure 2: Student/Teacher Ratio
Figure 3: Average Annual Teacher Salary
Next, Figures 4-7 show the percentage of students eligible to take the SAT and the
average SAT scores per state, divided into math, verbal and total scores. These distributions do
not follow a normal curve; they are bimodal, meaning that two separate ranges of values have high
frequency in each data set. The percentage of eligible students in Figure 4 has its
highest frequency between three and eighteen percent, with a smaller spike in frequency
between 63 and 78 percent. Figure 5 indicates that the highest frequency of math SAT scores falls
between 450 and 500, with another large spike between 525 and 550. The highest frequency of
verbal SAT scores, as shown in Figure 6, falls between 400 and 420 and again between 480 and
500; the number of scores that fall in each of these two ranges is the same. Finally, Figure 7
indicates that the highest frequency of total SAT scores falls between 850 and 900 and again
between 1000 and 1050.
Figure 4: Percentage of Students Eligible to take the SAT
Figure 5: SAT Scores – Math
Figure 6: SAT Scores – Verbal
Figure 7: SAT Scores - Total
Box plots are another way of presenting the data. They give a visual representation of
the distribution in quartiles: the middle 50 percent of the data, the median and any outliers that
may be skewing the data. The first box plot, Figure 8, has a median expenditure per pupil
of 5.7675 with a lower hinge of 4.88175 and an upper hinge of 6.434, indicating that the middle
50 percent falls between those two data points. This figure also shows that the data is
negatively skewed, as indicated by the larger amount of space on the left side of the box. The
first three box plots, Figures 8-10, all show outliers, indicating
abnormally high data points for the distribution of data given.
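The quantities read off a box plot can be sketched as follows. This is a minimal illustration under stated assumptions: the sample values are hypothetical, the function name box_plot_summary is invented for the example, the "inclusive" quantile method is assumed to correspond to the interpolation Excel's QUARTILE function uses, and the 1.5 x IQR fence is a common convention for flagging outliers rather than a rule stated in this paper.

```python
import statistics

def box_plot_summary(values):
    """Return the quartiles and any points outside the 1.5 * IQR fences."""
    # Assumption: method="inclusive" interpolates between data points,
    # which is taken here to match Excel's QUARTILE function.
    lower, median, upper = statistics.quantiles(values, n=4, method="inclusive")
    iqr = upper - lower  # width of the box (middle 50 percent)
    low_fence = lower - 1.5 * iqr
    high_fence = upper + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"lower": lower, "median": median, "upper": upper, "outliers": outliers}

# Hypothetical values (NOT the study data); 30 is deliberately extreme.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
summary = box_plot_summary(sample)
print(summary)
```

Comparing the distance from the lower quartile to the median with the distance from the median to the upper quartile shows which side of the box is wider.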
Figure 8: Expenditure per Pupil
In Figure 9 the median ratio of students per teacher is 16.6 with a lower hinge of 15.225
and an upper hinge of 17.575. There are also outliers in this data set and the majority of the
points fall below the median.
Figure 9: Student/Teacher Ratio
Figure 10 shows a median annual teacher salary of 33.2875 with a lower hinge of
30.9775 and an upper hinge of 38.5458. This data set indicates more data points above the
median as well as an outlier just beyond the end of the upper whisker.
Figure 10: Average Annual Teacher Salary (in thousands)
In Figure 11 the median percentage of students eligible to take the SAT is 28%, with the
majority of the data points above that percentage. The middle 50 percent of the data falls
between 9% and 63%.
Figure 11: Percentage of Students Eligible to take the SAT
Figures 12 – 14 detail the actual SAT Scores and are very similar in their distributions.
The median of the SAT math scores (Figure 12) is 497.5 with a lower hinge of 474.75 and an
upper hinge of 539.5. The median of the SAT verbal scores (Figure 13) is 448 with a lower hinge
of 427.25 and an upper hinge of 490.25. Finally, the median for the total SAT scores (Figure 14) is 945.5 with a
lower hinge of 897.25 and an upper hinge of 1032. In all three of these box plots, the majority of
the scores fall above the mean.
Figure 12: SAT Scores – Math
Figure 13: SAT Scores – Verbal
Figure 14: SAT Scores – Total
Finally, Figure 15 shows the frequency distribution for the division of regions in the US
created for the purposes of this study in a simple bar chart format. All of the data indicated above
will be examined in further detail in the analysis section of this paper.
Figure 15: Frequency Distribution by Region
Part 2
The next step in analyzing the data is to conduct an analysis of variance
(ANOVA) to determine whether there are statistically significant differences among regions. Once
statistical significance has been established through ANOVA, Tukey’s HSD test is applied to each
pair of regions for each variable to determine which regions differ from each other.
The group means then indicate the direction of each difference. Finally,
eta squared indicates the effect size, or practical
significance, for each variable: the percentage of variance in that variable that can be
explained by differences between regions.
First, looking at expenditure per pupil across the four regions, ANOVA indicates that the
null hypothesis should be rejected. That is to say that there is a statistically significant difference
between the regions. The Tukey HSD test shows that the difference is between regions one and
four as well as regions two and four. According to the confidence intervals, region four is
outspending all three of the other regions. The mean expenditure per pupil for region four (in
thousands of dollars) is 7.21. Regions one and two spend an average of 5.54 and 5.74
respectively, and region three spends, on average, less per pupil than the other three regions
at 4.85. Eta squared is 0.42, which is considered to be a strong effect. This number indicates that
42% of the variance in expenditure per pupil can be explained by regional differences.
Expenditure Per Pupil
ANOVA Table (alpha = 5%)
Source     SS         df   MS        F        Fcritical   p-value
Between    38.3015     3   12.767    11.143   2.8068      0.0000    Reject
Within     52.7034    46   1.1457
Total      91.0048    49
Estimates of Group Means
Group   Mean      Confidence Interval
1       5.54308   ± 0.5976   95%
2       5.74158   ± 0.622    95%
3       4.847     ± 0.6496   95%
4       7.21336   ± 0.5758   95%
Tukey test for pairwise comparison of group means
r = 4    n - r = 46    q0 = 3.79    T = 1.17109
Group   1     2     3
2
3
4       Sig   Sig
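The arithmetic that links the ANOVA table for expenditure per pupil to the F statistic and to eta squared can be checked with a short Python sketch. The sums of squares and degrees of freedom are the values reported in that table; the variable names are chosen for illustration only.

```python
# Sums of squares and degrees of freedom as reported for expenditure per pupil.
ss_between, df_between = 38.3015, 3
ss_within, df_within = 52.7034, 46

ms_between = ss_between / df_between   # mean square between regions
ms_within = ss_within / df_within      # mean square within regions
f_stat = ms_between / ms_within        # F statistic compared against Fcritical

# Eta squared: share of total variance attributable to region.
eta_squared = ss_between / (ss_between + ss_within)

print(round(ms_between, 3), round(ms_within, 4),
      round(f_stat, 3), round(eta_squared, 2))
```

Because the F statistic exceeds the tabled critical value of 2.8068, the null hypothesis is rejected, and the eta squared value is the 0.42 quoted in the text.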
Next, looking at the student/teacher ratio, ANOVA indicates that the null hypothesis
should be rejected because there is a statistically significant difference between the regions. The
Tukey HSD test shows that the significant difference is between regions one and two, regions
one and three and regions one and four. According to the confidence intervals, region one has a
higher ratio of students per teacher at 19.06, whereas regions two, three and four have average
ratios of 16.2, 17.08 and 15.2 respectively. Region four has the lowest student/teacher ratio. Eta
squared is 0.43 which is considered to be a strong effect size. This number indicates that 43% of
the difference between student/teacher ratios can be explained by regional differences.
Student/Teacher Ratio
ANOVA Table (alpha = 5%)
Source     SS         df   MS        F        Fcritical   p-value
Between    107.355     3   35.785    11.405   2.8068      0.0000    Reject
Within     144.327    46   3.1375
Total      251.682    49
Estimates of Group Means
Group   Mean      Confidence Interval
1       19.0615   ± 0.9889   95%
2       16.2      ± 1.0293   95%
3       17.0818   ± 1.075    95%
4       15.2      ± 0.9529   95%
Tukey test for pairwise comparison of group means
r = 4    n - r = 46    q0 = 3.79    T = 1.93795
Group   1     2     3
2       Sig
3       Sig
4       Sig
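The pairwise reading of the Tukey output for student/teacher ratio can be sketched in Python. The group means and the critical difference T are the values reported in the table; the rule applied here (a pair differs significantly when the absolute difference in means exceeds T) is the usual reading of a Tukey critical difference and is assumed, not confirmed, to be what the spreadsheet does. The function name significant_pairs is hypothetical.

```python
from itertools import combinations

# Group means and critical difference T as reported for student/teacher ratio.
means = {1: 19.0615, 2: 16.2, 3: 17.0818, 4: 15.2}
T = 1.93795

def significant_pairs(group_means, critical_diff):
    """Return region pairs whose mean difference exceeds the critical difference."""
    return [
        (a, b)
        for a, b in combinations(sorted(group_means), 2)
        if abs(group_means[a] - group_means[b]) > critical_diff
    ]

pairs = significant_pairs(means, T)
print(pairs)
```

Under that assumed rule, the flagged pairs are region one against each of the other three regions, which agrees with the pairs named in the text.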
Next, looking at the average yearly salary for teachers in each region, again ANOVA
indicates that the null hypothesis must be rejected because there is a significant difference between
the regions. The only regions that are statistically different in teacher salary according to Tukey’s
HSD test are regions two and four. Region four has the highest average annual teacher salary (in
thousands of dollars) at 39.60. Regions one, two and three have an average teacher salary of
34.71, 33.38 and 30.48 respectively. Region three has the lowest average annual teacher salary.
Eta squared is 0.32 which is considered to be a strong effect size. This number indicates that
32% of the differences between annual teacher salaries can be explained by regional differences.
Average Teacher Salary
ANOVA Table (alpha = 5%)
Source     SS         df   MS        F        Fcritical   p-value
Between    551.678     3   183.89    7.1811   2.8068      0.0005    Reject
Within     1177.95    46   25.608
Total      1729.63    49
Estimates of Group Means
Group   Mean      Confidence Interval
1       34.7125   ± 2.8251   95%
2       33.381    ± 2.9405   95%
3       30.4786   ± 3.0712   95%
4       39.5961   ± 2.7223   95%
Tukey test for pairwise comparison of group means
r = 4    n - r = 46    q0 = 3.79    T = 5.53648
Group   1     2     3
2
3
4             Sig
Next, looking at the percentage of students eligible to take the SAT test in each region,
ANOVA indicates that the null hypothesis must be rejected because there is a statistically
significant difference between the regions. Tukey’s HSD shows that the significant difference is
occurring between regions one and four as well as regions two and four. Region four has a much
higher average percentage of students who were eligible to take the SAT at 63.43. Regions one,
two and three had an average percentage of 30.38, 12.58, and 29.82 respectively. Region two has
the smallest percentage of students eligible to take the SAT. Eta squared is 0.51 which is
considered to be a strong effect size. This number indicates that 51% of the difference in
percentage of students eligible to take the SAT can be explained by regional differences.
Percent of Students Eligible to take the SAT
ANOVA Table (alpha = 5%)
Source     SS         df   MS        F        Fcritical   p-value
Between    17914.1     3   5971.4    15.988   2.8068      0.0000    Reject
Within     17181.1    46   373.5
Total      35095.1    49
Estimates of Group Means
Group   Mean      Confidence Interval
1       30.3846   ± 10.789   95%
2       12.5833   ± 11.23    95%
3       29.8182   ± 11.729   95%
4       63.4286   ± 10.397   95%
Tukey test for pairwise comparison of group means
r = 4    n - r = 46    q0 = 3.79    T = 21.1444
Group   1     2     3
2
3
4       Sig   Sig
Next, looking at the SAT scores for math, ANOVA indicates that the null hypothesis
should be rejected because there is a statistically significant difference between the regions.
Tukey’s HSD shows that regions one and two, one and four, two and three as well as two and
four are significantly different from each other. The average SAT math score for region two was
the highest at 555.33. Regions one, three and four had average scores of 508.54, 499 and 476.79
respectively. Region four had the lowest average SAT math scores. Eta squared is 0.52 which is
considered to be a strong effect size. This number indicates that 52% of the difference in math
SAT scores can be explained by regional differences.
SAT Scores - Math
ANOVA Table (alpha = 5%)
Source     SS         df   MS        F        Fcritical   p-value
Between    41390.3     3   13797     16.783   2.8068      0.0000    Reject
Within     37814.3    46   822.05
Total      79204.6    49
Estimates of Group Means
Group   Mean      Confidence Interval
1       508.538   ± 16.007   95%
2       555.333   ± 16.66    95%
3       499       ± 17.401   95%
4       476.786   ± 15.424   95%
Tukey test for pairwise comparison of group means
r = 4    n - r = 46    q0 = 3.79    T = 31.3688
Group   1     2     3
2       Sig
3             Sig
4       Sig   Sig
Next, looking at the verbal SAT scores for each region, ANOVA indicates that the null
hypothesis should be rejected because there is a statistically significant difference between the
regions. Tukey’s HSD test shows the difference is between regions one and two, two and three as
well as two and four. Region two has the highest average verbal SAT score at 492.75. Regions
one, three and four have average verbal scores of 455.31, 453.27 and 431.36 respectively.
Region four has the lowest average verbal SAT score. Eta squared is 0.41 which is considered to
be a strong effect size. This number indicates that 41% of the difference between the verbal SAT
scores can be explained by regional differences.
SAT Scores - Verbal
ANOVA Table (alpha = 5%)
Source     SS         df   MS        F        Fcritical   p-value
Between    24731.6     3   8243.9    10.564   2.8068      0.0000    Reject
Within     35898.4    46   780.4
Total      60630      49
Estimates of Group Means
Group   Mean      Confidence Interval
1       455.308   ± 15.596   95%
2       492.75    ± 16.233   95%
3       453.273   ± 16.954   95%
4       431.357   ± 15.029   95%
Tukey test for pairwise comparison of group means
r = 4    n - r = 46    q0 = 3.79    T = 30.5638
Group   1     2     3
2       Sig
3             Sig
4             Sig
Finally, looking at total SAT scores for each region, ANOVA indicates that the null
hypothesis should be rejected because there is a statistically significant difference between the
regions. According to Tukey’s HSD test the difference is occurring between regions one and
two, two and three as well as two and four. Region two has the highest average total SAT score
at 1048.08. The average total scores for regions one, three and four are 963.85, 952.27 and
908.14 respectively. Region four has the lowest average total SAT score. Eta squared is 0.47
which is considered to be a strong effect size. This number indicates that 47% of the difference
in total SAT scores can be explained by regional differences.
SAT Scores - Total
ANOVA Table (alpha = 5%)
Source     SS         df   MS        F        Fcritical   p-value
Between    129849      3   43283     13.783   2.8068      0.0000    Reject
Within     144459     46   3140.4
Total      274308     49
Estimates of Group Means
Group   Mean      Confidence Interval
1       963.846   ± 31.285   95%
2       1048.08   ± 32.563   95%
3       952.273   ± 34.011   95%
4       908.143   ± 30.147   95%
Tukey test for pairwise comparison of group means
r = 4    n - r = 46    q0 = 3.79    T = 61.3114
Group   1     2     3
2       Sig
3             Sig
4             Sig
In comparing the data given between the regions some interesting conclusions can be
drawn. First, it is noteworthy that while region four has the highest expenditure per pupil as well
as the highest average annual teacher salary, that region also has the lowest SAT scores across
the board. Region two, which had the highest SAT scores, was neither highest nor lowest in
terms of spending per pupil or teacher salary. This indicates that the amount of money spent in
schools may not have a statistically significant relationship to student achievement on the SAT.
However, the region with the lowest expenditure per pupil and average annual teacher salary,
region three, did not have the lowest SAT scores.
It is also interesting to note that region four had a markedly higher percentage of students
who were eligible to take the SAT, whereas region two had the lowest percentage of eligible
students. It is certainly worth further investigation to determine whether the number of students
eligible to take the SAT is impacting the data of this study in some way. For example, it would
be interesting to know what the criterion for eligibility is in each region. Perhaps only the most
successful students (based on a criterion different from SAT scores) were eligible to take the
SAT in region two, whereas region four may have had a more lax criterion, allowing far more
students to take the exam.
Across the data, the effect size was considered high for each area where statistical
significance was discovered. This indicates that for those areas where there is a statistically
significant difference, the practical significance is also high. The highest practical significance in
this study can be found in the comparisons between eligibility percentage in each region as well
as math SAT scores for each region – the percentage of difference in each of those areas which
can be explained by region is over 50%. The lowest area of practical significance in this study is
the difference between regions in terms of annual teacher salary with only 32% of the difference
being explained. The effect sizes of the remaining variables ranged between 41 and 47%.
Part 3
The final phase of analysis is to determine whether there is a correlational
relationship between the key variables. To do this, the correlation coefficient (r) and the
coefficient of determination (r²) must be calculated. A scatter plot gives a visual representation
of these data, indicating at a glance whether a correlation between the variables
exists and whether that correlation is positive or negative.
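How the slope of the regression line, r and r² relate to one another can be sketched in Python. This is an illustration only: the paired observations are hypothetical placeholders rather than the state data, and the function name linear_fit is invented for the example.

```python
import math

def linear_fit(xs, ys):
    """Return (slope, r, r_squared) for a least-squares line through the points."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx                 # change in Y per unit change in X
    r = sxy / math.sqrt(sxx * syy)    # correlation coefficient
    return slope, r, r * r            # r squared: share of variance in Y explained

# Hypothetical paired observations (NOT the study data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, r, r2 = linear_fit(xs, ys)
print(f"slope={slope:.3f} r={r:.3f} r2={r2:.3f}")
```

The slope and r always share the same sign, so a negative slope on a scatter plot corresponds to a negative correlation.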
The first pair of variables compared in this analysis is expenditure per pupil and
total SAT scores (Figure 16). The slope of the regression line for these variables is negative,
-20.892, indicating that there is a negative correlation between these two variables, which is to
say that as the expenditure per pupil increases, the total SAT score decreases. Taking a deeper
look at the data, the coefficient of correlation (r) is -0.380, indicating that there is a significant
correlation between these two variables because the critical value of r(48) is 0.361 for an alpha
level of .01. The coefficient of determination (r²) for these variables is 0.145, which indicates that
roughly 15% of the variance in total SAT scores is explained by expenditure per pupil. This
means that even though there is a correlation between the two variables, it is a low correlation.
Given that only about 15% of the variance in the Y variable (SAT scores) is explained by
the X variable (expenditure per pupil), it is important to continue seeking other variables that may
have stronger correlations and can explain the remaining 85%.
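The critical values of r quoted in this section can be related to the t distribution with a short Python sketch. The two-tailed critical t values for 48 degrees of freedom are taken from a standard t table and hard-coded here as assumptions; the function names critical_r and t_from_r are invented for the example.

```python
import math

def critical_r(t_crit, df):
    """Convert a two-tailed critical t value into the matching critical r."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

def t_from_r(r, df):
    """t statistic for testing whether a correlation differs from zero."""
    return r * math.sqrt(df) / math.sqrt(1 - r ** 2)

df = 48  # 50 states minus 2
# Assumed table values: two-tailed critical t for df = 48 at alpha .05 and .01.
t_05, t_01 = 2.0106, 2.6822

r_crit_05 = critical_r(t_05, df)
r_crit_01 = critical_r(t_01, df)
t_obs = t_from_r(-0.380, df)  # r reported for expenditure vs. total SAT score

print(round(r_crit_05, 3), round(r_crit_01, 3), round(t_obs, 2))
```

Under these assumed table values the critical r comes out at about 0.279 for alpha .05 and 0.361 for alpha .01, and the observed correlation of -0.380 exceeds the .01 threshold in absolute value.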
Figure 16
The next pair of variables compared in this analysis is student/teacher ratio and
total SAT scores (Figure 17). The slope of the regression line for these variables is positive,
2.683, indicating that there is a positive correlation between these two variables, which is to say
that as the student/teacher ratio increases, the total SAT score also increases. Taking a deeper
look at the data, the coefficient of correlation (r) is 0.081, indicating that there is no significant
correlation between these two variables because the critical value of r(48) is 0.279 for an alpha
level of .05. The coefficient of determination (r²) for these variables is 0.006, which indicates that
roughly 0.6% of the variance in total SAT scores is explained by student/teacher ratio. The absence
of a significant correlation between these two variables suggests that student/teacher ratio,
i.e. class size, is not a factor in the success of students on the SAT in this data set.
Figure 17
The next pair of variables compared in this analysis is student/teacher ratio and
annual teacher salary (Figure 18). The slope of the regression line for these variables is negative,
-0.003, indicating that there is a negative correlation between these two variables, which is to say
that as the student/teacher ratio increases, the annual teacher salary decreases slightly. Taking a
deeper look at the data, the coefficient of correlation (r) is -0.001, indicating that there is no
correlation between these two variables. The coefficient of determination (r²) for these variables
is 0.000, which indicates that almost 0% of the variance in annual teacher salary is explained by
student/teacher ratio. The absence of a significant correlation between these two variables
suggests that student/teacher ratio, i.e. class size, is not a factor in determining annual teacher
salary. It is interesting that there is no statistically significant correlation between teacher
salary and student/teacher ratio, since it would seem that school districts that
pay their teachers more would be unable to hire as many teachers as districts that pay less,
therefore increasing class size.
Figure 18
The next pair of variables compared in this analysis is expenditure per pupil and
annual teacher salary (Figure 19). The slope of the regression line for these variables is
positive, 0.200, indicating that there is a positive correlation between these two variables, which is
to say that as the expenditure per pupil increases, the annual teacher salary also increases. Taking
a deeper look at the data, the coefficient of correlation (r) is 0.870, indicating that there is a
significant correlation between these two variables because the critical value of r(48) is 0.361 for
an alpha level of .01. The coefficient of determination (r²) for these variables is 0.757, which
indicates that roughly 76% of the variance in annual teacher salary is explained by expenditure
per pupil. This is considered a strong correlation given that 76% of the variance in the Y variable
(annual teacher salary) is explained by the X variable (expenditure per pupil). This correlation is
not particularly surprising when it is considered that teacher salary is a component of expenditure
per pupil.
Figure 19
SUMMARY
Thorough analysis of the data collected indicates that there is not a statistically significant
relationship between school spending and SAT scores. The data for the variables
expenditure per pupil, student/teacher ratio and teacher salary are not normally distributed
(possibly due to outliers), and the data for math, verbal and total SAT scores as well
as the percentage of students eligible to take the SAT follow a bimodal distribution. Both facts
may be affecting the further analyses.
Tukey’s HSD tests showed significant differences between regions in all the
variables compared; however, analysis of those data raised more questions than answers. While it
was clear that the region with the highest spending did not have the highest test scores, the
region with the lowest spending did not have the lowest scores either. This indicates that there
are other variables impacting test scores beyond spending. One possible variable indicated in this
analysis is eligibility to take the SAT, because the region (two) with the lowest
percentage of eligibility had the highest scores. This variable (eligibility) also had one of the
highest levels of practical significance.
In an examination of correlation between key variables, it was determined that while
there is a significant correlation between school spending and SAT scores, that correlation only
accounts for approximately 15% of the variance, leaving 85% unexplained. This indicates again
that spending is not a major contributing factor in success on the SAT test.
Though school spending does not appear to have a statistically significant impact on
student success, the data in this study support the conclusion that further
research is needed before the true causes of student success can be determined
for any practical purpose. In conclusion, it appears that money may not be the
only variable that could impact student achievement on the SAT; therefore it would not be wise
to use this study as an argument to decrease school funding.