groupsrs.stat20.project2


Upload: spencer-nelson

Post on 11-Apr-2017


Exploring the Effects of Exercise on Academic Success

Spencer Nelson, Brandon Yu, Kandace Mok, Katherine Delk

Introduction

In this study, we use survey sampling techniques on UCLA students to investigate the relationship between the amount of time spent exercising and academic performance. We use the variables GPA, gender, major type, time spent studying in hours per week, and the number of days per week spent exercising. Ultimately, we would like to see whether a student’s GPA is positively associated with devoting time to exercise. Our goal is to use this project to encourage students to maintain a healthy lifestyle, and potentially to demonstrate a link between exercising and success in the classroom.

Many studies have reported positive by-products of exercise beyond an individual’s physical health. One such study, conducted by researchers at Purdue University [1], indicated that exercise leads to reduced stress levels and that this, in turn, makes students more awake (thus allowing them to study more). An article in the New York Times [2] also noted that the more committed a student is to studying, the more likely they are to be committed to exercising as well; we want to see if commitment to exercise promotes a strong work ethic in the classroom. Additional research has reported that children who exercise often are more attentive, have better time management, and have superior memory and problem-solving skills, which lead to higher scores on tests. We would like to test whether a similar relationship exists within our sample of college students, and see how strong that relationship is. Additionally, we want to see if membership in certain groups, such as different majors, is related to academic achievement.

Let’s quickly define a few terms. By “exercise,” we mean activity requiring physical effort, carried out especially to improve health or fitness; examples include weight-lifting, yoga, and sports. We have recorded this variable in days spent exercising per week. Next, we define “academic achievement” strictly as GPA.

We hypothesize that we will observe evidence that increased frequency of exercise has a positive effect on a college student’s GPA; in addition, we hypothesize that upon subsetting our data by different categories, such as major type, this trend will still hold. We also believe that GPA will be observed to be strongly dependent on other factors, such as hours spent studying per week.

In total, we collected 151 responses via surveys conducted through a Google form sent out to peers.

Data Analysis

Please See Appendix for Enlarged Graphics for Data Analysis and Modeling Sections

First, we would like to get a quick glimpse at our data in preparation for our modeling. For example, let’s take a look at the distributions of our various variables of interest. We would like to investigate whether we can identify any type of relationship between certain categories. Some interesting questions we would like to answer include: are high levels of exercise tied to high GPAs; are high frequencies of studying tied to high GPAs; are there differences in the distribution of GPA as you move across different majors?

[Fig. 1a: Major Type Distribution. Frequencies: Humanities 25, Quantitative 64, Science 62]

[Fig. 1b: Exercise Level Distribution. Frequencies: Low (<= 2 days/wk) 80, Medium (3,4 days/wk) 42, High (>= 5 days/wk) 29]

[Fig. 1c: GPA Level Distribution. Frequencies: Low (< 3.19) 25, Medium (3.2−3.59) 53, High (> 3.59) 73]

[Fig. 1d: Study Frequency Distribution. Frequencies: Low (<10 hr/wk) 33, Medium (10−20 hr/wk) 53, High (>20 hr/wk) 65]

*A note on how “major types” were divided in Fig.1a

We define Humanities as creative-thinking majors, including writing, political science, and linguistics. Quantitative majors include statistics, mathematics, economics, and engineering. Science majors relate to subjects tied with the life sciences, including biology, chemistry, and psychology.
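The Low, Medium, and High levels used for exercise, GPA, and study frequency can be written as simple cut points. The sketch below is one possible encoding, not the code used for this report: the function names are hypothetical, and the treatment of values that fall exactly on a boundary (for example a GPA of exactly 3.6) is an assumption, since the figure labels do not settle it.

```python
def level(value, low_cut, high_cut):
    """Return 'Low', 'Medium', or 'High' for a numeric value.

    Values below low_cut are Low, values at or above high_cut are High,
    and everything in between is Medium.
    """
    if value < low_cut:
        return "Low"
    if value >= high_cut:
        return "High"
    return "Medium"


def exercise_level(days_per_week):
    # Low: <= 2 days/wk, Medium: 3 or 4 days/wk, High: >= 5 days/wk
    return level(days_per_week, 3, 5)


def gpa_level(gpa):
    # Assumed boundaries: Low below 3.2, Medium 3.2 up to 3.6, High from 3.6
    return level(gpa, 3.2, 3.6)


def study_level(hours_per_week):
    # Low: < 10 hr/wk, Medium: 10 to 20 hr/wk, High: > 20 hr/wk
    if hours_per_week < 10:
        return "Low"
    if hours_per_week > 20:
        return "High"
    return "Medium"


# One hypothetical respondent: 4 exercise days, GPA 3.45, 22 study hours
print(exercise_level(4), gpa_level(3.45), study_level(22))
```

Keeping the cut points in one place makes it easy to check how sensitive the later tests are to where the boundaries are drawn.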

[Fig. 2a: GPA and Exercise Levels. Barplot of GPA level counts (Low GPA <3.19, Med GPA 3.2−3.59, High GPA >3.6) within Low, Med, and High exercise levels]

[Fig. 2b: GPA and Gender. Barplot of GPA level counts for Female and Male students]

[Fig. 2c: GPA and Major Type. Barplot of GPA level counts for Humanities, Quantitative, and Science majors]

(2a)/(2b)/(2c) Here we have created barplots to observe how GPA varies across different categories in preparation for our chi-squared test of independence. Notice how in Fig. 2a, the distribution of GPA does not seem to vary very much across different levels of exercise. Similarly in Fig. 2b, we can see that the distributions of GPA among males and females are not radically different. However, in Fig. 2c, the distribution of GPA seems to change as you move across different major types. In particular, under Humanities, medium GPA levels make up the largest share, while for Science majors, high GPAs are the most prevalent.

[Fig. 3a: Major Type and Frequency of Exercise. Days Spent Exercising per Week (0 to 7) for Humanities, Quantitative, and Science majors]

[Fig. 3b: Major Type and Hours Studied. Hours Spent Studying per Week for Humanities, Quantitative, and Science majors]

[Fig. 3c: Study Hours v. Exercise Days. Hours Spent Studying per Week against Days Spent Exercising per Week (0 to 7)]

(3a)/(3b) We would like to investigate whether there are particular differences among our different majors which may account for the varying distributions of GPA. In Fig. 3a, we notice that the distribution of the number of days per week spent exercising is quite similar across our three majors; in fact, Quantitative majors and Science majors have identical distributions. By contrast, in Fig. 3b, we can see that the distribution of the number of hours per week spent studying varies much more; in particular, Humanities majors seem to be spending less time studying, whereas Quantitative majors have the highest median in hours spent studying per week. We will investigate the independence of Major Type and Study Levels in the next section.

Modeling

Now, we would like to perform chi-squared tests to assess whether our categories of interest are independent. We will start by testing whether GPA Levels and Exercise Levels are independent. We noted in our Data Analysis section that, upon visual inspection, there did not seem to be much variation in the distribution of GPA Levels across different Exercise Levels (see Fig. 2a). Thus, we suspect that GPA Levels and Exercise Levels are independent of one another; or in other words, that we cannot predict GPA from Exercise Levels. More formally, we construct our hypotheses as follows.

Ho : GPA Levels and Exercise Levels are independent of one another.

Ha : GPA Levels and Exercise Levels are NOT independent of one another.
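The statistic behind this pair of hypotheses can be sketched in a few lines. The example below is an illustration under stated assumptions, not the computation used for the report: the function name is hypothetical, and the nine cell counts are made up. Only the row totals (80, 42, 29 for exercise level) and column totals (25, 53, 73 for GPA level) match the distributions in Fig. 1b and Fig. 1c.

```python
def chi_squared_independence(observed):
    """Pearson chi-squared statistic for a table of observed counts.

    observed is a list of rows; returns (statistic, df, expected).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # Expected count under independence: row total * column total / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Sum of (observed - expected)^2 / expected over every cell
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# HYPOTHETICAL cell counts. Rows: Low/Medium/High exercise.
# Columns: Low/Medium/High GPA. Only the margins come from the report.
observed = [
    [14, 27, 39],
    [6, 16, 20],
    [5, 10, 14],
]

statistic, df, expected = chi_squared_independence(observed)
print(df)                      # 4 for a 3 x 3 table
print(round(statistic, 3))     # statistic for the made-up table only
```

Because the expected counts depend only on the margins, they are the same for any 3 x 3 table with these totals; the statistic itself depends on the actual cells, which are not listed in the report.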

Running a chi-squared test of independence yields the following results:

[Figure: Chi-Square Density Graph, df = 4, with χ2 = 1.5105 and p = 0.8248 marked]

[Figure: Standard Normal curve showing the standardized residuals for Low Ex (<= 2), Med Ex, and High Ex (>= 5) relative to the rejection region]

At a significance level of α = 0.05, we fail to reject the null hypothesis; we do not have convincing evidence that knowing a student’s Exercise Level helps us predict his or her GPA Level, so our data are consistent with these two variables being independent. Notice how our standardized residuals, which can be thought of as z-values under a standard normal curve, stay between our rejection regions.
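The p-value for this test can be checked by hand, because the chi-squared upper tail has a closed form whenever the degrees of freedom are even. The sketch below uses only the Python standard library; the function name is hypothetical, and the shortcut does not apply to odd degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable.

    Uses the closed form that holds only when df is even:
    exp(-x/2) * sum of (x/2)^k / k! for k = 0 .. df/2 - 1
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


alpha = 0.05
p_value = chi2_sf_even_df(1.5105, 4)   # statistic and df from the GPA v. Exercise test
print(round(p_value, 4))               # 0.8248
print(p_value < alpha)                 # False, so we fail to reject the null
```

For df = 4 the formula reduces to exp(-x/2) * (1 + x/2), which reproduces the reported p = 0.8248 for χ2 = 1.5105.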

GPA v. Major Test

Observing Fig. 2c from our Data Analysis section, we notice that we do NOT have similar GPA distributions across our different major types. In particular, medium GPA seems to make up a large proportion of the observations among Humanities students compared to students studying Quantitative and Science topics. We would like to test if this implies the two variables are not independent. We set up our hypotheses similarly:

Ho : GPA Levels and Major Types are independent of one another.

Ha : GPA Levels and Major Types are NOT independent of one another.

Running a chi-squared test of independence yields the following results:

[Figure: Chi-Square Density Graph, with χ2 = 10.589 and p = 0.03159 marked]

[Figure: Standard Normal curve showing the standardized residuals for Humanities, Quantitative, and Science relative to the rejection region; the labeled residuals are −2.344, 1.995, and 2.856]

Because our p-value is below the significance level of α = 0.05, we reject the null hypothesis; there is convincing evidence that GPA Levels and Major Type are NOT independent. We are able to pinpoint which cells contribute most to this result. Notice how we have a strongly negative residual of -2.34 for Science students in our Medium GPA category and a positive residual of 1.99 for Science students in our High GPA category. This indicates that our sample contained more Science students in the High GPA category than we would expect under independence, which was counterbalanced by fewer Science students than expected in the Medium GPA category. Similarly, for Humanities students, our stray residual of 2.86 shows that our sample contained more Humanities students in the Medium GPA category than expected under independence, which was counterbalanced by fewer than expected in the Low GPA and High GPA categories.
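A residual summary of this kind can be sketched as follows. The formula is the usual standardized (adjusted) residual for a contingency table; whether it is exactly the residual reported above is an assumption. The function name is hypothetical and the cell counts are made up, with only the margins (majors 25, 64, 62 and GPA levels 25, 53, 73) taken from Fig. 1a and Fig. 1c, so the output will not reproduce -2.34, 1.99, or 2.86.

```python
import math


def standardized_residuals(observed):
    """Standardized residual for each cell of a table of counts.

    z = (O - E) / sqrt(E * (1 - row_total/n) * (1 - col_total/n)),
    where E is the expected count under independence.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    residuals = []
    for i, row in enumerate(observed):
        out = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            var = e * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            out.append((o - e) / math.sqrt(var))
        residuals.append(out)
    return residuals


# HYPOTHETICAL cell counts. Rows: Humanities/Quantitative/Science.
# Columns: Low/Medium/High GPA. Only the margins come from the report.
observed = [
    [3, 15, 7],
    [12, 23, 29],
    [10, 15, 37],
]

residuals = standardized_residuals(observed)

# Cells whose residual falls in the two-sided 5% rejection region
flagged = [
    (i, j, round(z, 2))
    for i, row in enumerate(residuals)
    for j, z in enumerate(row)
    if abs(z) > 1.96
]
print(flagged)
```

A positive residual means the cell holds more students than independence would predict, and a negative residual means it holds fewer.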

Different Habits among Students of Different Majors?


Given our results, which show that GPA and Major are not independent, we would like to investigate if there are certain habitual differences among students of different majors. In particular, we would like to test whether a student’s major is independent of two factors: his or her level of exercise and how frequently he or she studies per week. Let’s investigate exercise as our first variable of interest. Again, we set up the hypotheses:

Ho : Major Types and Exercise Levels are independent of one another.

Ha : Major Types and Exercise Levels are NOT independent of one another.

Running a chi-squared test of independence yields the following results:

[Figure: Chi-Square Density Graph, df = 4, with χ2 = 6.559 and p-value = 0.195 marked]

[Figure: Standard Normal curve showing the standardized residuals for Humanities, Quantitative, and Science relative to the rejection region]

At a p-value of 0.195, we fail to reject the null hypothesis; we do not have convincing evidence that Major Types and Exercise Levels are associated. This is consistent with our first chi-squared test, which found no convincing evidence that a student’s GPA and his or her exercise level depend on one another. Notice again how our standardized residuals stay outside of the critical regions of our standard normal graph. Since exercise level does not appear to differ by major, we suspect that another factor is influencing the differences in GPA among different major types. We will now focus our attention on analyzing whether a student’s particular major and how frequently he or she studies per week are independent.

Major Type v. Study Levels

Here, we will be investigating if there is variation in the frequency of a student’s studying based on his or her major. Again, we set up our hypotheses similarly:

Ho : Major Types and Study Levels are independent of one another.

Ha : Major Types and Study Levels are NOT independent of one another.

Running a chi-squared test of independence produces the following results:

[Figure: Chi-Square Density Graph, df = 4, with χ2 = 13.123 and p = 0.01069 marked]

[Figure: Standard Normal curve showing the standardized residuals for Humanities, Quantitative, and Science relative to the rejection region; the labeled residuals are −2.548, −1.987, 2.478, and 2.93]

At a p-value of 0.01, we reject the null hypothesis; there is convincing evidence that Major Types and Study Levels are NOT independent. Thus, we have shown that there are, indeed, differences in study habits among students of different majors. We can observe our residual summary to pinpoint which categories are contributing to the test’s statistical significance. In particular, notice that in our Quantitative category, our residual of 2.48 indicates that our sample contained many more students who study frequently than we would expect under independence, and this was counterbalanced by fewer Quantitative students than expected with low frequencies of studying, as indicated by the negative residual of -1.99. The opposite trend occurred among Humanities students, where our sample contained fewer students with high frequencies of studying than expected, as indicated by the negative residual of -2.55; this was counterbalanced by more Humanities students than expected with low levels of studying, as indicated by the highly positive residual of 2.93.

Conclusion

Overall, our findings in this study do not support our initial hypothesis that we would observe differing distributions of academic performance among students who exercised at different weekly frequencies. Rather, it appears that GPA distribution varies when we categorize students based upon their field of study. Furthermore, upon dividing students by major types, we find that differences in the number of weekly hours dedicated to studying likely account for this non-uniform GPA distribution.

Let us revisit some of our statistical results which led to the aforementioned conclusions. After running a chi-squared test of independence between students’ GPA levels (low, medium, high) and their weekly exercise frequency, we obtained a p-value of 0.82, which gives us no evidence against the independence of the two categories. In other words, our data do not suggest substantially different GPA distributions among students who exercise at different rates. A similar analysis between GPA levels and Major Types yielded a p-value of 0.03, below our significance level of 0.05; this indicates that we should expect the distribution of GPAs to change as we move across different major types. Indeed, our residual analysis supported this; for Science students, a positive residual of 1.99 showed that our sample contained more students in the High GPA category than expected under independence, and this was offset by fewer Science students than expected in the Medium GPA category, as indicated by a strongly negative residual of -2.34. Similarly, a residual of 2.86 indicated that our sample contained more Humanities students in the Medium GPA category than expected, which was offset by fewer Humanities students than expected in the High GPA category.

We were interested in investigating potential reasons as to why GPA distribution varied across Major Types, so we ran two separate chi-squared tests of independence: Major Type v. Exercise Levels and Major Type v. Study Frequency. Our test between Major Type and Exercise Levels yielded a p-value of 0.195, giving us no convincing evidence that knowing a student’s major tells us anything about his or her frequency of exercise. This is consistent with our first chi-squared test, which found no evidence that GPA Levels and Exercise Levels are associated. However, running a chi-squared test between Major Type and Study Frequency yielded a p-value of 0.01, indicating that the distribution of students’ study frequency should be expected to differ among majors. Indeed, residual analysis showed that our sample contained more Humanities students with Low study frequency, and fewer Humanities students with High study frequency, than expected under independence. This aligns with the fact that the Humanities lacked a high proportion of its students in the High GPA category, which suggests that hours spent studying and GPA are related.

Let’s discuss the real-world implications of our results. We did not find evidence that GPA Level depends on Exercise Level. And this result does make sense; one would not expect exercise alone to be a contributor to a high GPA. Some make the claim that students who exercise more have higher GPAs, because students who exercise more also tend to be more active academically. For our particular sample, however, as seen in Fig. 3c, whether a student exercises zero days per week or seven days a week, the distribution of hours spent studying seems fairly uniform. Our data therefore suggest that exercise alone has little effect on GPA, and that higher GPAs are largely a byproduct of simply longer hours dedicated to studying; studies which claim a relationship exists between exercise and GPA may derive their results from samples containing students who BOTH study highly frequently AND exercise highly frequently.


Appendix

References

[1] A study by Purdue University students investigating the effects of exercise on academic success

http://www.purdue.edu/newsroom/releases/2013/Q2/college-students-working-out-at-campus-gyms-get-better-grades.html

[2] A study by the New York Times investigating the positive effects of exercise on cognitive abilities and mental health.

http://well.blogs.nytimes.com/2010/06/03/vigorous-exercise-linked-with-better-grades/

Below are enlarged graphics from our Data Analysis and Modeling Sections

[Enlarged versions of Figs. 1a-1d, Figs. 2a-2c, Figs. 3a-3c, and the Chi-Square Density and Standard Normal residual graphs for each of the four chi-squared tests]